Preface


This book presents a general introduction to enumerative combinatorics that emphasizes 
bijective methods. The text contains a systematic development of the mathematical tools 
needed to solve enumeration problems: basic counting rules, recursions, inclusion-exclusion 
techniques, generating functions, bijective proofs, and linear-algebraic methods. These tools 
are used to analyze many combinatorial structures including words, permutations, sub- 
sets, functions, compositions, integer partitions, graphs, trees, lattice paths, multisets, rook 
placements, set partitions, Eulerian tours, derangements, posets, tilings, and abaci. Later 
chapters delve into some of the algebraic aspects of combinatorics, including detailed treat- 
ments of formal power series, symmetric groups, group actions, symmetric polynomials, 
determinants, and the combinatorial calculus of tableaux. 

This text is suitable for enumerative combinatorics courses at the beginning graduate 
or advanced undergraduate levels. The book is somewhat more advanced than standard 
undergraduate texts on discrete mathematics, but is less mathematically demanding than 
the technical masterpieces in the subject (e.g., Stanley’s two-volume treatise on enumeration
[127] or Macdonald’s monograph on symmetric functions [89]). There should be ample
material in the book for a year-long course. A one-semester introduction to combinatorics 
might cover most of the first eight chapters, possibly excluding Chapter 3 if there is a sep- 
arate course offered on graph theory. The more technical aspects of Chapter 7 (on formal 
power series) can be skipped or skimmed over if pressed for time. A course emphasizing 
abstract algebra and its applications to combinatorics could be based on Chapters 2, 7, 9, 
10, and 11. Chapter 12 consists of independent sections on optional topics that complement 
material in the main text. In many chapters, some of the later sections can be omitted 
without loss of continuity. 

In principle, the text requires no mathematical prerequisites except for a familiarity 
with basic logic, set theory, and proof techniques. Certain sections of the book (which can 
be skipped in more elementary courses) do assume the reader has had some exposure to 
ideas from linear algebra, such as linear independence and bases. The chapters dealing with 
abstract algebraic structures (groups, rings, fields, vector spaces, and formal power series) 
are self-contained, providing all relevant definitions as they are needed. Thus, students and 
scholars with no prior background in algebra or combinatorics can profitably use this book 
for reference or self-study. 

Each chapter ends with a summary, a set of exercises, and bibliographic notes. The book 
contains nearly one thousand exercises, ranging in difficulty from routine verifications to 
unsolved problems. Solutions, hints, or partial answers to many of these exercises are given 
in an appendix. Although we provide references to the literature for some of the major 
theorems and harder problems, no attempt has been made to pinpoint the original source 
for every result appearing in the text and exercises. 


I am grateful to several anonymous reviewers for valuable feedback and comments on an
early version of the manuscript, and to Bob Stern and the other editors and staff at CRC 
Press for their facilitation of the publication process. The dedicated efforts of copyeditors 
and proofreaders removed many errors from this text, but I bear full responsibility for those 
that remain. Readers may communicate errors and other comments to the author by sending
e-mail to nloehr@vt.edu. Errata and other pertinent information will be maintained on the
book’s website: http://www.math.vt.edu/people/nloehr/bijbook.html

I thank E. Brown, H. Freeman, F. Loehr, L. Lopez, A. Mendes, and G. Warrington for 
their advice and support during the writing of this book. Finally, I wish to mention some 
very special people who died before this book could be completed: Julie, Elina, and 32 of 
my fellow students and faculty at Virginia Tech who were lost three years ago on this date. 


Nicholas A. Loehr 
April 16, 2010 


EPIGRAPH 


“Meaningless! Meaningless!” says the Teacher. 
“Utterly meaningless! Everything is meaningless.” 
What is twisted cannot be straightened; 
what is lacking cannot be counted. 


— Ecclesiastes 1:2,15. 
Introduction


Enumerative combinatorics is the mathematical theory of counting. How many ways can we 
deal a thirteen-card bridge hand that has three face cards and is void in clubs? How many 
functions map a ten element set onto a seven element set? How many rearrangements of 
1,2,...,n have no decreasing subsequence of length three? How many ways can we divide an 
assembly of twenty people into five groups? How many invertible functions on {1,2,...,n} 
are equal to their own inverse? How many ways can we seat ten men and five women at 
a circular table so no two women are adjacent? How many ways can we write a positive 
integer n as a sum of positive integers? The techniques of enumerative combinatorics allow 
us to find answers to questions like these. 

This book develops the basic principles of enumeration, placing particular emphasis on 
the role of bijective proofs. To prove that a set S of objects has size n bijectively, one must 
construct an explicit one-to-one correspondence (bijection) from S onto the set {1,2,...,n}. 
More generally, one can prove that two sets A and B have the same size by exhibiting a 
bijection between A and B. For example, fix $n \ge 1$ and let A be the set of all strings
$w_1 w_2 \cdots w_{2n}$ consisting of n left parentheses and n right parentheses that are balanced
(every left parenthesis can be matched to a right parenthesis later in the sequence). Let B
be the set of all arrays

$$\begin{pmatrix} y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{pmatrix}$$

such that every number in $\{1,2,\ldots,2n\}$ appears once in the array, $y_1 < y_2 < \cdots < y_n$,
$z_1 < z_2 < \cdots < z_n$, and $y_i < z_i$ for every i. The sets A and B seem quite different at first
glance. Yet, we can prove that A and B have the same cardinality by means of the following
bijection. Given $w = w_1 w_2 \cdots w_{2n} \in A$, let $y_1 < y_2 < \cdots < y_n$ be the positions of the left
parentheses in w (taken in increasing order), and let $z_1 < z_2 < \cdots < z_n$ be the positions of
the right parentheses in w (in increasing order). For example, the string (()())((()))()
in A maps to the array

$$\begin{pmatrix} 1 & 2 & 4 & 7 & 8 & 9 & 13 \\ 3 & 5 & 6 & 10 & 11 & 12 & 14 \end{pmatrix}.$$

One may check that the requirement $y_i < z_i$ for all i is equivalent to the fact that w is a
balanced string of parentheses. The string w is uniquely determined by the array of $y_i$’s and
$z_i$’s, and every such array arises from a suitable choice of $w \in A$. Thus we have defined the
required one-to-one correspondence between A and B. We now know that the sets A and 
B have the same size, although we have not yet determined what that size is! 
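
As an informal check of the correspondence just described, the following Python sketch records the positions of the left and right parentheses and rebuilds the string from them. The function names are our own illustrative choices and do not come from the text.

```python
def parens_to_array(w):
    """Map a string of parentheses to the pair (y, z) of position lists."""
    y = [i for i, c in enumerate(w, start=1) if c == "("]  # left parentheses
    z = [i for i, c in enumerate(w, start=1) if c == ")"]  # right parentheses
    return y, z


def array_to_parens(y, z):
    """Inverse map: rebuild the string from the two rows of the array."""
    w = [""] * (len(y) + len(z))
    for i in y:
        w[i - 1] = "("
    for i in z:
        w[i - 1] = ")"
    return "".join(w)


def is_balanced(w):
    """Check that no prefix has more ')' than '(' and the totals agree."""
    depth = 0
    for c in w:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False
    return depth == 0


w = "(()())((()))()"
y, z = parens_to_array(w)
print(y)  # top row of the array
print(z)  # bottom row of the array
print(all(a < b for a, b in zip(y, z)), is_balanced(w))
```

For this string the two rows agree with the array displayed in the example, and applying array_to_parens to them returns the original string.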

Bijective proofs, while elegant, can be very difficult to discover. For example, let C be 
the set of rearrangements of 1,2,...,n that have no decreasing subsequence of length three.
There is a remarkable bijection between the set B (defined above) and the set C. But the 
reader may wish to defer a search for such a bijection until reading §12.11. 

Luckily, the field of enumerative combinatorics contains a whole arsenal of techniques 
to help us solve complicated enumeration problems. Besides bijections, some of these tech- 
niques include recursions, generating functions, group actions, inclusion-exclusion formulas, 
linear algebra, probabilistic methods, symmetric polynomials, and more. We end this intro- 
duction by describing several challenging enumeration problems that can be solved using 
these more advanced methods. These problems, and the combinatorial technology needed
to solve them, will be discussed at greater length later in the text. 


Standard Tableaux 


Suppose we are given a diagram D consisting of a number of rows of boxes, left-justified, 
with each row no longer than the one above it. For example, consider the diagram: 


Let n be the total number of boxes in the diagram. A standard tableau of shape D is a filling 
of the boxes in D with the numbers 1, 2, ..., n (used once each) so that every row forms an
increasing sequence (reading left to right), and every column forms an increasing sequence 
(reading top to bottom). For example, here are three standard tableaux of shape D, where 
D is the diagram pictured above: 


Question: Given a diagram D of n cells, how many standard tableaux of shape D are there? 

There is a truly amazing answer to this counting problem, known as the hook-length 
formula. To state it, we need to define hooks and hook-lengths. The hook of a box b in a
diagram D consists of all boxes to the right of b in its row, all boxes below b in its column,
and box b itself. The hook-length of b, denoted h(b), is the number of boxes in the hook of
b. For example, if b is the first box in the second row of D, then the hook of b consists of
the marked boxes in the following picture:


Hook-Length Formula: The number of standard tableaux of shape D is n! divided by the 
product of the hook-lengths of all the boxes in D. 
For the diagram D in our example, the formula says there are exactly 


$$\frac{9!}{7 \cdot 5 \cdot 2 \cdot 1 \cdot 4 \cdot 2 \cdot 3 \cdot 1 \cdot 1} = 216$$


standard tableaux of shape D. Observe that the set B of 2 x n arrays (discussed above) can 
also be enumerated with the aid of the hook-length formula. In this case, the diagram D
consists of two rows of length n. The hook-lengths for boxes in the top row are
$n+1, n, n-1, \ldots, 2$, while the hook-lengths in the bottom row are $n, n-1, \ldots, 1$. Since there are 2n
boxes in all, the hook-length formula asserts that

$$|B| = \frac{(2n)!}{[(n+1)n(n-1)\cdots 2] \cdot [n(n-1)\cdots 1]} = \frac{(2n)!}{(n+1)!\,n!}.$$


The fraction on the right side is an integer called the nth Catalan number. Since we previ- 
ously displayed a bijection between B and A (the set of strings of balanced parentheses), 
we conclude that the size of A is also given by a Catalan number. As we will see, many 
different types of combinatorial structures are counted by the Catalan numbers. 
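
The hook-length computation is easy to mechanize. The sketch below represents a diagram by its list of row lengths (a representation we choose for illustration) and evaluates n! divided by the product of the hook-lengths. The shape [4, 2, 2, 1] is an assumption on our part: it is the nine-box shape whose hook-lengths, read row by row, are the factors 7, 5, 2, 1, 4, 2, 3, 1, 1 appearing in the example.

```python
from math import comb, factorial, prod


def hook_lengths(shape):
    """Hook-lengths, row by row, of a diagram given by weakly decreasing row lengths."""
    hooks = []
    for r, row_len in enumerate(shape):
        for c in range(row_len):
            arm = row_len - c - 1                         # boxes to the right
            leg = sum(1 for s in shape[r + 1:] if s > c)  # boxes below
            hooks.append(arm + leg + 1)                   # plus the box itself
    return hooks


def count_standard_tableaux(shape):
    """Hook-length formula: n! divided by the product of all hook-lengths."""
    return factorial(sum(shape)) // prod(hook_lengths(shape))


print(hook_lengths([4, 2, 2, 1]))
print(count_standard_tableaux([4, 2, 2, 1]))

# Two rows of length n should give the nth Catalan number (2n)!/((n+1)! n!).
for n in range(1, 8):
    assert count_standard_tableaux([n, n]) == comb(2 * n, n) // (n + 1)
```

Under this assumption on the shape, the computed count matches the value 216 found above.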

How is the hook-length formula proved? Many proofs of this formula have been found 
since it was originally discovered in 1954. There are algebraic proofs, probabilistic proofs, 
combinatorial proofs, and (relatively recently) fully bijective proofs of this formula. Here we 
discuss a flawed probabilistic argument that gives a little intuition for how the mysterious 
hook-length formula arises. Suppose we choose a random filling F of the boxes of D with
the integers 1,2,...,n. What is the probability that this filling will actually be a standard 
tableau? We remark that the filling is standard if and only if for every box b in D, the 
entry in b is the smallest number in the hook of b. Since any of the boxes in the hook 
is equally likely to contain the smallest value, we see that the probability of this event is 
1/h(b). Multiplying these probabilities together would give $1/\prod_{b \in D} h(b)$ as the probability
that the random filling we chose is a standard tableau. Since the total number of possible
fillings is n! (cf. Chapter 1), this leads us to the formula $n!/\prod_{b \in D} h(b)$ for the number of
standard tableaux of shape D. 

Unfortunately, the preceding argument contains a fatal error. The events “the entry 
in box b is the smallest in the hook of b,” for various choices of b, are not necessarily
independent (see §1.14). Thus we cannot find the probability that all these events occur by 
multiplying together the probabilities of each individual event. Nevertheless, remarkably, 
the final answer obtained by making this erroneous independence assumption turns out to 
be correct! This fact can be justified by a more subtle probabilistic argument due to Greene, 
Nijenhuis, and Wilf [62]. We describe this argument in §12.10. 
Rook Placements


A rook is a chess piece that can travel any number of squares along its current row or column 
in a single move. We say that the rook attacks all the squares in its row and column. How 
many ways can we place eight rooks on an ordinary 8 x 8 chessboard so that no two rooks 
attack one another? The answer is 8! = 40,320. More generally, we can show that there 
are n! ways to place n non-attacking rooks on an n x n chessboard. To see this, first note 
that there must be exactly one rook in each of the n rows. The rook in the top row can 
occupy any of the n columns. The rook in the next row can occupy any of the n— 1 columns 
not attacked by the first rook; then there are n — 2 available columns for the next rook, 
and so on. By the product rule (discussed in Chapter 1), the total number of placements is 
therefore $n \times (n-1) \times (n-2) \times \cdots \times 1 = n!$.

Now consider an (n + 1) x (n + 1) chessboard with a bishop occupying the upper-left 
corner square. (A bishop is a chess piece that attacks all squares that can be reached from 
its current square by moving in a straight line northeast, northwest, southeast, or southwest 
along a diagonal of the chessboard.) Question: How many ways can we place n rooks on
this chessboard so that no two pieces attack one another? An example of such a placement
on a standard chessboard (n + 1 = 8, so n = 7) is shown below: 


It turns out that the number of non-attacking placements is the closest integer to n!/e. Here,
e is the famous constant $e = \sum_{k=0}^{\infty} 1/k! \approx 2.718281828$ that appears throughout the subject
of calculus. When n = 7, the number of placements is 1854 (note $7!/e = 1854.112\cdots$).

This answer follows from the inclusion-exclusion formulas to be discussed in Chapter 4. 
We sketch the derivation now to indicate how the number e appears. First, there are n! 
ways to place the n rooks on the board so that no two rooks attack each other, and no rook 
occupies the top row or the leftmost column (lest a rook attack the bishop). However, we 
have counted many configurations in which one or more rooks occupy the diagonal attacked 
by the bishop. To correct for this, we will subtract a term that accounts for configurations of 
this kind. We can build such a configuration by placing a rook in row i, column i, for some
i between 2 and n+ 1, and then placing the remaining rooks in different rows and columns 
in (n — 1)! ways. So, presumably, we should subtract n x (n — 1)! = n! from our original 
count of n!. But now our answer is zero! The trouble is that our subtracted term over-counts 
those configurations in which two or more rooks are attacked by the bishop. A naive count 
leads to the conclusion that there are $\binom{n}{2}(n-2)! = n!/2!$ such configurations, but this
figure over-counts configurations with three or more rooks on the main diagonal. Thus we 
are led to a formula (called an inclusion-exclusion formula) in which we alternately add and 
subtract various terms to correct for all the over-counting. In the present situation, the final 
answer turns out to be 


$$n! - n! + n!/2! - n!/3! + n!/4! - n!/5! + \cdots + (-1)^n n!/n! = n! \sum_{k=0}^{n} (-1)^k/k!.$$

Next, recall from calculus that $e^x = \sum_{k=0}^{\infty} x^k/k!$ for all real x. In particular, taking $x = -1$,
we have

$$\frac{1}{e} = e^{-1} = 1 - 1 + 1/2! - 1/3! + 1/4! - 1/5! + \cdots = \sum_{k=0}^{\infty} (-1)^k/k!.$$

We see that the combinatorial formula stated above consists of the first n + 1 terms in
the infinite series for n!/e. It can be shown (§4.5) that the “tail” of this series (namely
$\sum_{k=n+1}^{\infty} (-1)^k n!/k!$) is always less than 0.5 in absolute value. Thus, rounding n!/e to the
nearest integer will produce the desired answer.
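
For small n the three descriptions of this count can be compared directly. In the sketch below (function names are ours), a placement is encoded as a permutation p of 0, ..., n-1 avoiding the diagonal, meaning p[i] differs from i; this ignores the first row and column occupied or attacked through the bishop's corner, which is a modeling choice made for illustration.

```python
from itertools import permutations
from math import e, factorial


def placements_by_inclusion_exclusion(n):
    """n! * sum_{k=0}^{n} (-1)^k / k!, evaluated exactly with integers."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))


def placements_by_brute_force(n):
    """Count permutations p of range(n) with no rook on the diagonal."""
    return sum(
        1
        for p in permutations(range(n))
        if all(p[i] != i for i in range(n))
    )


for n in range(1, 8):
    exact = placements_by_inclusion_exclusion(n)
    print(n, exact, placements_by_brute_force(n), round(factorial(n) / e))
```

For each n from 1 to 7 the three numbers on a line coincide; in particular the line for n = 7 gives 1854.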

Another interesting combinatorial problem arises by comparing non-attacking rook 
placements on two boards of different shapes. For instance, consider the two generalized 
chessboards shown here: 
One can check that for every $k \ge 1$, the number of ways to place k non-attacking rooks
on the first board is the same as the number of ways to place k non-attacking rooks on the 
second board. We say that two boards are rook-equivalent whenever this property holds. It 
turns out that an n x n board is always rook-equivalent to a board with successive row 
lengths 2n — 1,2n — 3,...,5,3,1. More generally, there is a simple criterion for deciding 
whether two boards “of partition shape” are rook-equivalent. We will present this criterion 
in §12.3. 


Tilings 


Now we turn to yet another problem involving chessboards. A domino is a rectangular object 
that can cover two horizontally or vertically adjacent squares on a chessboard. A tiling of a 
board is a covering of the board with dominos such that each square is covered by exactly 
one domino. For example, here is one possible tiling of a standard 8 x 8 chessboard: 


Question: Given a board of dimensions $m \times n$, how many ways can we tile it with dominos?
This question may seem unfathomably difficult, so let us first consider the special case where
m = 2. In this case, we are tiling a $2 \times n$ region with dominos. Let $f_n$ be the number of
such tilings, for n = 0, 1, 2, .... One can see by drawing pictures that

$$f_0 = f_1 = 1,\ f_2 = 2,\ f_3 = 3,\ f_4 = 5,\ f_5 = 8,\ f_6 = 13,\ \ldots.$$

The reader may recognize these numbers as being the start of the famous Fibonacci sequence.
This sequence is defined recursively by letting $F_0 = F_1 = 1$ and $F_n = F_{n-1} + F_{n-2}$ for all
$n \ge 2$. Now, a routine counting argument can be used to prove that the tiling numbers
$f_n$ satisfy the same recursive formula $f_n = f_{n-1} + f_{n-2}$. (To see this, note that a $2 \times n$
tiling either ends with one vertical domino or two stacked horizontal dominos. Removing
this part of the tiling either leaves a $2 \times (n-1)$ tiling counted by $f_{n-1}$ or a $2 \times (n-2)$ tiling
counted by $f_{n-2}$.) Since the sequences $(f_n)$ and $(F_n)$ satisfy the same recursion and initial
conditions, they must agree for all n.
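
The recursion translates directly into a short loop. This is a minimal sketch; the function name is our own.

```python
def tilings_2_by_n(n):
    """Number of domino tilings of a 2 x n board, via f_n = f_{n-1} + f_{n-2}."""
    a, b = 1, 1          # a = f_0, b = f_1
    for _ in range(n):
        a, b = b, a + b  # shift the window (f_i, f_{i+1}) one step to the right
    return a


print([tilings_2_by_n(n) for n in range(7)])
```

The printed list reproduces the values 1, 1, 2, 3, 5, 8, 13 found above by drawing pictures.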

Now, what about the original tiling problem? Since the area of a tiled board must be 
even, there are no tilings unless at least one of the dimensions of the board is even. For 
boards satisfying this condition, Kasteleyn [75] and Fisher and Temperley [36] proved the 
following amazing result. The number of domino tilings of an m x n chessboard (with m 
even) is exactly equal to 


$$2^{mn/2} \prod_{j=1}^{m/2} \prod_{k=1}^{n} \sqrt{\cos^2\left(\frac{j\pi}{m+1}\right) + \cos^2\left(\frac{k\pi}{n+1}\right)}.$$


The formula is especially striking since the individual factors in the product are, in general,
irrational numbers, yet the product of all these factors is a positive integer! When m = n = 8,
the formula reveals that the number of domino tilings of a standard chessboard is 12,988,816.
The proof of the formula involves Pfaffians, which are quantities analogous to determinants
that arise in the study of skew-symmetric matrices. For details, see §12.12 and §12.13. 
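
The product formula can be tested numerically. The sketch below assumes the formula has the form $2^{mn/2} \prod_{j=1}^{m/2} \prod_{k=1}^{n} \sqrt{\cos^2(j\pi/(m+1)) + \cos^2(k\pi/(n+1))}$ with m even, evaluates it in floating-point arithmetic, and rounds to the nearest integer. The function name is ours, and rounding is only trustworthy while the floating-point error stays well below 0.5, which limits this approach to moderately sized boards.

```python
from math import cos, pi, sqrt


def domino_tilings(m, n):
    """Evaluate the product formula for an m x n board (m even) and round."""
    if m % 2 != 0:
        raise ValueError("this form of the formula assumes m is even")
    value = 2.0 ** (m * n // 2)
    for j in range(1, m // 2 + 1):
        for k in range(1, n + 1):
            value *= sqrt(cos(j * pi / (m + 1)) ** 2 + cos(k * pi / (n + 1)) ** 2)
    return round(value)


print([domino_tilings(2, n) for n in range(1, 7)])  # compare with the 2 x n counts
print(domino_tilings(8, 8))
```

For m = 2 the values agree with the Fibonacci numbers obtained earlier, and for m = n = 8 the result is 12,988,816.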


Notes 


Various proofs of the hook-length formula may be found in [42, 45, 62, 101, 108]. Treatments 
of rook-equivalence and other aspects of rook theory appear in [41, 55, 56, 74]. The domino 
tiling formula was proved by Kasteleyn [75] and discovered independently by Fisher and 
Temperley [36]. 
Basic Counting


This chapter develops the basic counting techniques that form the foundation of enumerative 
combinatorics. We apply these techniques to study fundamental combinatorial structures 
such as words, permutations, subsets, functions, and lattice paths. The end of the chapter 
gives some applications of combinatorics to probability theory. 
1.1 Review of Set Theory


We assume the reader is familiar with elementary aspects of logic and set theory, including 
proofs by induction. This material may be found in texts such as [34, 126]. Table 1.1 reviews 
the notation we will use from set theory. The word iff is defined to mean “if and only if.” 


TABLE 1.1
Review of notation from set theory.

Concept              Symbol                          Meaning
membership           $x \in S$                       x is an element of the set S.
set-building         $\{x : P(x)\}$                  $y \in \{x : P(x)\}$ iff P(y) is true.
subset               $A \subseteq B$                 For all x, $x \in A$ implies $x \in B$.
set equality         $A = B$                         For all x, $x \in A$ iff $x \in B$.
empty set            $\emptyset$                     For all x, $x \notin \emptyset$.
cardinality          $|A| = n$                       The set A has exactly n members.
union                $A \cup B$                      $x \in A \cup B$ iff $x \in A$ or $x \in B$.
intersection         $A \cap B$                      $x \in A \cap B$ iff $x \in A$ and $x \in B$.
set difference       $A \sim B$                      $x \in A \sim B$ iff $x \in A$ and $x \notin B$.
ordered pair         $(a, b)$                        $(a, b) = (c, d)$ iff a = c and b = d.
Cartesian product    $A \times B$                    $A \times B = \{(a, b) : a \in A \text{ and } b \in B\}$.
finite union         $A_1 \cup \cdots \cup A_n$      $x \in A_1 \cup \cdots \cup A_n$ iff $x \in A_i$ for at least one $i \le n$.
finite intersection  $A_1 \cap \cdots \cap A_n$      $x \in A_1 \cap \cdots \cap A_n$ iff $x \in A_i$ for all $i \le n$.
ordered n-tuple      $(a_1, \ldots, a_n)$            $(a_1, \ldots, a_n) = (b_1, \ldots, b_n)$ iff $a_i = b_i$ for $1 \le i \le n$.
finite product       $A_1 \times \cdots \times A_n$  $A_1 \times \cdots \times A_n = \{(a_1, \ldots, a_n) : a_i \in A_i \text{ for } i \le n\}$.


We use the notation N = {0,1,2,3,...}, N* = {1,2,3,...}, Z for the set of all integers, 
Q for the set of rational numbers, R for the set of real numbers, and C for the set of complex 
numbers. Informally, the notation |A| = n means that A is a set consisting of n elements. 
We will give a more formal discussion of cardinality later (§1.6). 

Two sets A and B are disjoint iff $A \cap B = \emptyset$. More generally, the sets $A_1, \ldots, A_n$ are
called pairwise disjoint iff $A_i \cap A_j = \emptyset$ for all $i \ne j$. This means that no two sets in the
given list overlap one another.
1.2 Sum Rule


The starting point for enumerative combinatorics is the following basic fact. 

1.1. Counting Principle. If A and B are finite disjoint sets, then $|A \cup B| = |A| + |B|$.

The requirement that A and B be disjoint is certainly necessary. For example, if A =
{1, 2, 3} and B = {3, 5}, then $|A \cup B| = 4$, while $|A| + |B| = 3 + 2 = 5$. We will give a formal
proof of 1.1 later (see 1.32). For now, let us deduce some consequences of this counting
principle.

1.2. Sum Rule. If $A_1, \ldots, A_m$ are pairwise disjoint finite sets, then

$$|A_1 \cup \cdots \cup A_m| = |A_1| + \cdots + |A_m|.$$

Proof. We use induction on m. The case m = 1 is immediate, while the case m = 2 is true
by 1.1. For m > 2, assume the result is known for $m - 1$ sets. In 1.1, let $A = A_1 \cup \cdots \cup A_{m-1}$
and $B = A_m$. By induction hypothesis,

$$|A| = |A_1| + \cdots + |A_{m-1}|.$$

Since $A_m$ does not intersect any $A_j$ with j < m, we see that A and B are disjoint. So 1.1
gives

$$|A_1 \cup \cdots \cup A_m| = |A \cup B| = |A| + |B| = |A_1| + \cdots + |A_{m-1}| + |A_m|.$$ □


1.3. Difference Rule. If S and T are finite sets such that $T \subseteq S$, then $|S \sim T| = |S| - |T|$.


Proof. The set S is the union of the disjoint sets T and $S \sim T$. Therefore, 1.1 gives
$|S| = |T| + |S \sim T|$. Subtracting the finite quantity |T| from both sides gives the result. □


We can generalize 1.1 to the case where the two sets in question are not disjoint, as 
follows. 


1.4. Binary Union Rule. If A and B are arbitrary finite sets, then $|A \cup B| = |A| + |B| - |A \cap B|$.


Proof. Note that A is the disjoint union of $A \sim B$ and $A \cap B$; B is the disjoint union of
$B \sim A$ and $A \cap B$; and $A \cup B$ is the disjoint union of $A \sim B$, $B \sim A$, and $A \cap B$. See
Figure 1.1. Applying the sum rule repeatedly, we see that

$$|A| = |A \sim B| + |A \cap B|; \quad |B| = |B \sim A| + |A \cap B|; \quad |A \cup B| = |A \sim B| + |B \sim A| + |A \cap B|.$$

Using the first two equations to eliminate $|A \sim B|$ and $|B \sim A|$ in the third equation, we
obtain the desired result. □
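
The decomposition used in this proof can be observed on the small example A = {1, 2, 3}, B = {3, 5} considered earlier. The following sketch uses Python's built-in set operations; the variable names are ours.

```python
A = {1, 2, 3}
B = {3, 5}

# The three pairwise disjoint pieces from the proof.
only_a = A - B   # A ~ B
only_b = B - A   # B ~ A
both = A & B     # A ∩ B

# Sum rule applied to the three pieces, then the binary union rule.
by_pieces = len(only_a) + len(only_b) + len(both)
by_union_rule = len(A) + len(B) - len(both)
print(len(A | B), by_pieces, by_union_rule)
```

All three printed numbers equal 4, the size of the union.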


The sum rule can also be extended to a formula for $|A_1 \cup \cdots \cup A_n|$, where $A_1, \ldots, A_n$ are
arbitrary (not necessarily pairwise disjoint) finite sets. This formula is called the inclusion-
exclusion formula; we will study it later (Chapter 4).
FIGURE 1.1
Proof of the binary union rule. 
1.3 Product Rule


We can use the sum rule to compute the size of the Cartesian product of finite sets. 


1.5. Product Rule for Sets. Suppose $S_1, \ldots, S_k$ are finite sets with $|S_i| = n_i$ for
$1 \le i \le k$. Then

$$|S_1 \times S_2 \times \cdots \times S_k| = n_1 n_2 \cdots n_k.$$

Proof. We proceed by induction on k. There is nothing to prove when k = 1. Consider the
case k = 2. Let $S_2 = \{x_1, x_2, \ldots, x_{n_2}\}$. The set $S_1 \times S_2$ is the disjoint union of the $n_2$ sets
$S_1 \times \{x_1\}$, $S_1 \times \{x_2\}$, ..., $S_1 \times \{x_{n_2}\}$. Each of these sets has cardinality $|S_1| = n_1$. So, by
the sum rule,

$$|S_1 \times S_2| = \sum_{i=1}^{n_2} |S_1 \times \{x_i\}| = \sum_{i=1}^{n_2} n_1 = n_1 n_2.$$

Next, let k > 2 and assume that the result is already known for products of $k - 1$ sets. We
can regard $S_1 \times S_2 \times \cdots \times S_k$ as the Cartesian product $A \times B$, where $A = S_1 \times \cdots \times S_{k-1}$
and $B = S_k$. By induction, $|A| = n_1 n_2 \cdots n_{k-1}$. By the k = 2 case,

$$|S_1 \times S_2 \times \cdots \times S_k| = |A \times B| = |A| \cdot |B| = (n_1 n_2 \cdots n_{k-1}) n_k.$$

This completes the induction step. □
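
A quick experiment with three small sets illustrates the product rule for sets. The sets below are arbitrary examples of ours, and itertools.product lists the elements of the Cartesian product as tuples.

```python
from itertools import product
from math import prod

S1 = {"a", "b"}
S2 = {0, 1, 2}
S3 = {"w", "x", "y", "z"}

tuples = list(product(S1, S2, S3))           # all triples (s1, s2, s3)
expected = prod(len(S) for S in (S1, S2, S3))  # n_1 * n_2 * n_3
print(len(tuples), expected)
```

Both numbers are 2 · 3 · 4 = 24, and the listed triples are all distinct.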


1.6. Example: License Plates. A California license plate consists of a digit, followed by
three uppercase letters, followed by three more digits. Formally, we can view a license plate
as an element of the set $S = D \times L \times L \times L \times D \times D \times D$, where D = {0, 1, 2, ..., 9} and
L = {A, B, C, ..., Z}. Thus,


$$|S| = 10 \times 26 \times 26 \times 26 \times 10 \times 10 \times 10 = 175{,}760{,}000.$$


1.7. Example: Phone Numbers. A phone number is a ten-digit sequence such that the
first digit is not zero or one, while the second digit must be zero or one. Formally, we can
view a phone number as an element of the set $S = \{2, 3, \ldots, 9\} \times \{0, 1\} \times D^8$, where the
notation $D^8$ denotes the Cartesian product of 8 copies of the set D = {0, 1, ..., 9}. The
number of phone numbers is


$$|S| = 8 \times 2 \times 10^8 = 1.6 \text{ billion}.$$

(To allow for more phone numbers, the restriction on the second digit of the area code was
removed years ago.)
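
The arithmetic in the license plate and phone number examples is easy to confirm with exact integer arithmetic. This is only a sanity check; the variable names are ours.

```python
digits = 10    # |D|, the ten decimal digits
letters = 26   # |L|, the uppercase letters

# Digit, three letters, three digits.
license_plates = digits * letters**3 * digits**3

# First digit in {2,...,9}, second digit in {0,1}, then eight free digits.
phone_numbers = 8 * 2 * digits**8

print(license_plates)  # 175760000
print(phone_numbers)   # 1600000000
```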


We will often be interested in finding the cardinality of a finite set S whose members are 
“structured objects.” Frequently, we will be able to build up each object in S by making a 
sequence of choices. The next counting principle tells us how to compute |S| in this situation. 


1.8. Product Rule. Suppose each object x in a set S can be uniquely constructed by
making a sequence of k choices. Suppose the first choice can be made in $n_1$ ways; the
second choice can be made in $n_2$ ways (regardless of what the first choice was); and so
on. In general, we suppose that the ith choice can be made in $n_i$ ways, regardless of what
happened in the first $i - 1$ choices, for all $i \le k$. Then


$$|S| = n_1 n_2 \cdots n_k.$$

The product rule is a consequence of 1.5, as we will explain in 1.34.


1.9. Example: Fraternity and Sorority Names. The name of a fraternity or sorority
consists of any sequence of two or three uppercase Greek letters. (The Greek alphabet has
24 letters.) How many possible names are there? The set S of all such names is the disjoint
union of $S_2$ and $S_3$, where $S_k$ is the set of names of length k. Using the sum rule,


$$|S| = |S_2| + |S_3|.$$


We can calculate $|S_2|$ using the product rule. We build a typical word in $S_2$ by choosing
the first letter (24 ways), then choosing the second letter (24 ways). By the product rule,
$|S_2| = 24^2$. Similarly, $|S_3| = 24^3$, so $|S| = 24^2 + 24^3 = 14{,}400$. Note that we cannot directly
use the product rule to calculate |S|, since the number of choices in a given application of
the product rule must be fixed.


1.10. Example. How many three-digit odd numbers contain the digit 2 but not the digit 
5? Let X be the set of all such numbers. We can write X as the disjoint union of three sets 
A, B, and C, where A consists of numbers in X with first and second digit 2, B consists 
of numbers in X with first digit 2 and second digit not 2, and C consists of numbers in
X with second digit 2 and first digit not 2. To build a number in C, we choose the digits
from left to right. There are seven choices for the first digit (we must avoid 0, 2, and 5),
one choice for the second digit (it must be 2), and four choices for the third digit (which is
odd and unequal to 5). By the product rule, $|C| = 7 \cdot 1 \cdot 4 = 28$. Similar reasoning shows
that |A| = 4 and $|B| = 1 \cdot 8 \cdot 4 = 32$. Therefore, |X| = |A| + |B| + |C| = 64.
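
Since there are only 900 three-digit numbers, the count in this example can be confirmed by exhaustive search. The sketch below builds X directly from its definition and then splits it into the sets A, B, and C described above; treating each number as a string of digits is our own implementation choice.

```python
# Three-digit odd numbers that contain the digit 2 but not the digit 5.
X = [
    n
    for n in range(100, 1000)
    if n % 2 == 1 and "2" in str(n) and "5" not in str(n)
]

A = [n for n in X if str(n)[0] == "2" and str(n)[1] == "2"]
B = [n for n in X if str(n)[0] == "2" and str(n)[1] != "2"]
C = [n for n in X if str(n)[0] != "2" and str(n)[1] == "2"]

print(len(A), len(B), len(C), len(X))
```

The sizes found are 4, 32, and 28, and every element of X lies in exactly one of the three sets, so |X| = 64.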
1.4 Words, Permutations, and Subsets


1.11. Definition: Words. Let A be a finite set. A word over the alphabet A is a sequence
$w = w_1 w_2 \cdots w_k$, where each $w_i \in A$ and $k \ge 0$. The length of $w = w_1 w_2 \cdots w_k$ is k. Two
words $w = w_1 w_2 \cdots w_k$ and $z = z_1 z_2 \cdots z_m$ are equal iff k = m and $w_i = z_i$ for $1 \le i \le k$.


1.12. Example. Let A = {a,b,c,...,z} be the set of 26 lowercase letters in the English 
alphabet. Then stop, opts, and stoops are distinct words (of lengths 4, 4, and 6, respectively). 
If A = {0,1}, the 8 words of length 3 over A are 


000, 001, 010, 011, 100, 101, 110, 111.

There is exactly one word of length zero, called the empty word. It is sometimes denoted
by the special symbols $\cdot$ or $\epsilon$.


1.13. Theorem: Enumeration of Words. If A is an n-letter alphabet and $k \ge 0$, then
there are $n^k$ words of length k over A.


Proof. We can uniquely construct a typical word $w = w_1 w_2 \cdots w_k$ by a sequence of choices.
First, choose $w_1 \in A$ to be any of the n letters in A. Second, choose $w_2 \in A$ in any of n
ways. Continue similarly, choosing $w_i \in A$ in any of n ways for $1 \le i \le k$. By the product
rule, the number of words is $n \times n \times \cdots \times n$ (k factors), which is $n^k$. Note that the empty
word is the unique word of length 0 over A, so our formula holds for k = 0 also. □
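
The sequence of choices in this proof is what itertools.product performs when asked for k letters from the same alphabet. The helper below is a small illustrative sketch with a name of our choosing.

```python
from itertools import product


def words(alphabet, k):
    """List all words of length k over the given alphabet, as strings."""
    return ["".join(w) for w in product(alphabet, repeat=k)]


print(words("01", 3))
print(len(words("abc", 4)), 3**4)
print(words("01", 0))  # the single word of length zero is the empty string
```

The first line lists the eight binary words of length 3, and the count for a three-letter alphabet with k = 4 agrees with $3^4 = 81$.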


1.14. Definition: Permutations. Let A be an n-element set. A permutation of A is a
word $w = w_1 w_2 \cdots w_n$ in which each letter of A appears exactly once. For example, the 6
permutations of A = {x, y, z} are


xyz, xzy, yxz, yzx, zxy, zyx.

1.15. Definition: Factorials. For each integer $n \ge 1$, n-factorial is

$$n! = n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1,$$

which is the product of the first n positive integers. We also define 0! = 1.


1.16. Theorem: Enumeration of Permutations. There are n! permutations of an n-
letter alphabet A.


Proof. Build a typical permutation $w = w_1 w_2 \cdots w_n$ of A by making n choices. First, choose
$w_1$ to be any of the n letters of A. Second, choose $w_2$ to be any of the $n - 1$ letters of A
different from $w_1$. Third, choose $w_3$ to be any of the $n - 2$ letters of A different from
$w_1$ and $w_2$. Proceed similarly; at the nth stage, choose $w_n$ to be the unique letter of A
that is different from $w_1, w_2, \ldots, w_{n-1}$. By the product rule, the number of permutations is
$n \times (n-1) \times \cdots \times 1 = n!$. The result also holds when n = 0. □


1.17. Definition: k-Permutations. Let A be an n-element set. A k-permutation of A
is a word $w = w_1 w_2 \cdots w_k$ consisting of k distinct letters in A. For example, the twelve
2-permutations of A = {a, b, c, d} are


ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc.

An n-permutation of A is the same as a permutation of A.


1.18. Theorem: Enumeration of k-Permutations. Suppose A is an n-letter alphabet.
For $0 \le k \le n$, the number of k-permutations of A is


$$n(n-1)(n-2)\cdots(n-k+1) = \frac{n!}{(n-k)!}.$$


For k > n, there are no k-permutations of A.


Proof. Build a typical k-permutation $w = w_1 w_2 \cdots w_k$ of A by making k choices. First,
choose $w_1$ to be any of the n letters of A. Second, choose $w_2$ to be any of the $n - 1$ letters
of A different from $w_1$. Continue similarly. When we choose $w_i$ (where $1 \le i \le k$), we have
already used the $i - 1$ distinct letters $w_1, w_2, \ldots, w_{i-1}$. Since A has n letters, there are
$n - (i-1) = n - i + 1$ choices available at stage i. In particular, for the kth and final choice,
there are $n - k + 1$ ways to choose $w_k$. By the product rule, the number of k-permutations is
$\prod_{i=1}^{k} (n - (i-1)) = n(n-1)\cdots(n-k+1)$. Multiplying this expression by $(n-k)!/(n-k)!$,
we obtain the product of the integers 1 through n in the numerator, which is n!. Thus the
answer is also given by the formula $n!/(n-k)!$. □
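
Python's itertools.permutations generates k-permutations in exactly this choice-by-choice fashion, so the formula can be checked on small alphabets. The helper name below is ours.

```python
from itertools import permutations
from math import factorial


def k_permutations(alphabet, k):
    """List all words of k distinct letters drawn from the alphabet."""
    return ["".join(w) for w in permutations(alphabet, k)]


print(k_permutations("abcd", 2))

# Compare the number of k-permutations with n!/(n-k)! for an alphabet of size n = 5.
n = 5
for k in range(n + 1):
    assert len(k_permutations("abcde", k)) == factorial(n) // factorial(n - k)

print(k_permutations("abcd", 5))  # k > n: no k-permutations exist
```

The first line reproduces the twelve 2-permutations of {a, b, c, d} listed earlier, and the last line prints an empty list.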
1.19. Definition: Power Set. For any set S, the power set P(S) is the set of all subsets of
S. Thus, $T \in P(S)$ iff $T \subseteq S$. For example, if S = {2, 5, 7}, then P(S) is the eight-element
set


$$\{\emptyset, \{2\}, \{5\}, \{7\}, \{2,5\}, \{2,7\}, \{5,7\}, \{2,5,7\}\}.$$


1.20. Theorem: Cardinality of Power Sets. An $n$-element set has $2^n$ subsets. In other
words, if $|S| = n$, then $|P(S)| = 2^n$.


Proof. Suppose $S = \{x_1, \ldots, x_n\}$ is an $n$-element set. We can build a typical subset $T$ of
$S$ by making a sequence of $n$ choices. First, decide whether $x_1 \in T$ or $x_1 \notin T$. This binary
decision can be made in two ways. Second, decide whether $x_2 \in T$ or $x_2 \notin T$; again there
are two possibilities. Continue similarly; decide in the $i$th choice whether $x_i \in T$ or $x_i \notin T$
(two possibilities). This sequence of choices uniquely determines which $x_i$'s belong to $T$.
Since $T$ is a subset of $S$, this information uniquely determines the set $T$. By the product
rule, the number of subsets is $2 \times 2 \times \cdots \times 2$ ($n$ factors), which is $2^n$. □
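
The sequence of binary decisions in this proof translates directly into a short program. The following Python sketch assumes the input elements are distinct; the function name `power_set` is a hypothetical helper chosen for this illustration.

```python
def power_set(elements):
    """Build all subsets by deciding, for each element in turn, whether it is included."""
    subsets = [frozenset()]
    for x in elements:
        # Each partial subset branches in two ways: leave x out, or put x in.
        subsets = [s | extra
                   for s in subsets
                   for extra in (frozenset(), frozenset([x]))]
    return subsets


P = power_set([2, 5, 7])
print(len(P))  # 8
```

Each pass through the loop doubles the length of the list, which is the product rule at work.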
1.5 Functions


This section reviews the definitions of functions, injections, surjections, and bijections, which 
should already be familiar to the reader. We also enumerate the number of functions, injec- 
tions, and bijections between two given finite sets. The enumeration of surjections is more 
subtle, and will be discussed later (§2.10). 


1.21. Definition: Functions. Formally, a function $f$ from $X$ to $Y$ is an ordered triple
$(X, Y, G)$, where $G$ is a subset of $X \times Y$ such that for each $x \in X$ there is exactly one $y \in Y$
with $(x,y) \in G$. $X$ is the domain of $f$, $Y$ is the codomain of $f$, and $G$ is the graph of $f$. We
write $y = f(x)$ iff $(x,y) \in G$, and we write $f : X \to Y$ to signify that $f$ is a function from
$X$ to $Y$. Let ${}^X Y$ denote the set of all functions from $X$ to $Y$.


Informally, we think of a function $f$ as consisting of a rule that maps each $x \in X$ to a
unique value $f(x) \in Y$. When $X$ and $Y$ are finite sets, it is convenient to visualize $f$ by
an arrow diagram. We obtain this diagram by drawing a dot for each element of $X$ and $Y$,
and drawing an arrow from $x$ to $y$ whenever $y = f(x)$. The definition of a function requires
that each $x \in X$ have exactly one arrow emanating from it, and the arrow must point to an
element of $Y$. On the other hand, an element $y \in Y$ may have zero, one, or more than one
arrow hitting it. Figures 1.2, 1.3, 1.4, and 1.5 depict the arrow diagrams for some functions.


1.22. Theorem: Enumeration of Functions. Suppose $X = \{x_1, \ldots, x_n\}$ is an $n$-element
set and $Y = \{y_1, \ldots, y_m\}$ is an $m$-element set. There are $m^n$ functions from $X$ to $Y$. In
other words, $|{}^X Y| = |Y|^{|X|}$.


Proof. To build a typical function $f \in {}^X Y$, we make a sequence of $n$ choices that uniquely
determine the graph $G$ of $f$. First, we choose $f(x_1)$ to be any of the $m$ elements of $Y$.
Second, we choose $f(x_2)$ to be any of the $m$ elements of $Y$. Similarly, for each $i \le n$, we
choose $f(x_i)$ to be any of the $m$ elements of $Y$. By the product rule, the number of functions
we can build is $m \times m \times \cdots \times m$ ($n$ factors), which is $m^n$. □
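
As a concrete illustration of this proof, the sketch below lists every function between two small finite sets by choosing one value for each element of the domain. Representing a function as a Python dict (its graph) and the name `all_functions` are assumptions made for this example only.

```python
from itertools import product


def all_functions(X, Y):
    """Yield every function X -> Y as a dict, choosing one value in Y per element of X."""
    X = list(X)
    Y = list(Y)
    for values in product(Y, repeat=len(X)):
        yield dict(zip(X, values))


functions = list(all_functions([1, 2, 3], "ab"))
print(len(functions))  # 8, which is 2**3
```

With an empty domain the generator yields exactly one function (the empty dict), in agreement with $m^0 = 1$.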


1.23. Definition: Injections. A function $g : X \to Y$ is an injection iff for all $x, x' \in X$,
$x \ne x'$ implies $g(x) \ne g(x')$. Injective functions are also called one-to-one functions.

FIGURE 1.2
A function $f : \{1,2,3,4\} \to \{a,b,c,d\}$.


FIGURE 1.3
An injective function $g : \{1,2,3\} \to \{a,b,c,d\}$.


In the arrow diagram for an injective function, every $y \in Y$ has at most one arrow
entering it. For example, the function $f$ in Figure 1.2 is not injective, while the function $g$
in Figure 1.3 is injective.


1.24. Theorem: Enumeration of Injections. Suppose $X = \{x_1, \ldots, x_n\}$ is an $n$-element
set and $Y = \{y_1, \ldots, y_m\}$ is an $m$-element set. If $n \le m$, the number of injections from $X$
into $Y$ is $m(m-1)(m-2)\cdots(m-n+1) = m!/(m-n)!$. If $n > m$, there are no injections
from $X$ to $Y$.


Proof. Assume first that $n \le m$. As above, we construct a typical injection $g : X \to Y$ by
choosing the $n$ function values $g(x_i)$, for $1 \le i \le n$. For each $i \le n$, we choose $g(x_i)$ to be an
element of $Y$ distinct from the elements $g(x_1), \ldots, g(x_{i-1})$ already chosen. Since the latter
elements are pairwise distinct, we see that there are $m-(i-1) = m-i+1$ alternatives for
$g(x_i)$, no matter what happened in the first $i-1$ choices. By the product rule, the number
of injections is $m(m-1)\cdots(m-n+1) = m!/(m-n)!$.

On the other hand, suppose $n > m$. Try to build an injection $g$ by choosing the values
$g(x_1), g(x_2), \ldots$ as before. When we try to choose $g(x_{m+1})$, there are no elements of $Y$
distinct from the previously chosen elements $g(x_1), \ldots, g(x_m)$. Since it is impossible to
complete the construction of $g$, there are no injections from $X$ to $Y$ in this situation. □
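
Both cases of 1.24 can be confirmed by exhaustive search when $n$ and $m$ are small. In the Python sketch below, a function from $\{0,\ldots,n-1\}$ to $\{0,\ldots,m-1\}$ is represented by its tuple of values; this encoding and the name `count_injections` are assumptions of the example.

```python
from itertools import product
from math import factorial


def count_injections(n, m):
    """Count injections from an n-element set into an m-element set by brute force."""
    count = 0
    for values in product(range(m), repeat=n):
        # The function is injective exactly when its n values are pairwise distinct.
        if len(set(values)) == n:
            count += 1
    return count


print(count_injections(3, 4))  # 24 = 4!/(4-3)!
print(count_injections(3, 2))  # 0, since n > m
```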


1.25. Definition: Surjections. A function $h : X \to Y$ is a surjection iff for every $y \in Y$
there exists $x \in X$ with $y = h(x)$. Surjective functions are also said to be onto or to map
onto the codomain $Y$.




FIGURE 1.4
A surjective function $h : \{1,2,3,4\} \to \{a,b,c\}$.


FIGURE 1.5
A bijective function $f : \{1,2,3,4\} \to \{a,b,c,d\}$.


In the arrow diagram for a surjective function, every $y \in Y$ has at least one arrow
entering it. For example, the functions $f$ and $g$ in Figures 1.2 and 1.3 are not surjective,
while the function $h$ in Figure 1.4 is surjective. Note that $h$ is not injective. Counting
surjections is harder than counting other classes of functions, so we defer discussion of this
problem to a later section (§2.10).


1.26. Definition: Bijections. A function $f : X \to Y$ is a bijection iff $f$ is both injective
and surjective iff for every $y \in Y$ there exists a unique $x \in X$ with $y = f(x)$.


In the arrow diagram for a bijective function, every $y \in Y$ has exactly one arrow entering
it. For example, the functions in Figures 1.2 through 1.4 are not bijective, while the function
$f$ in Figure 1.5 is bijective.


1.27. Theorem: Injectivity vs. Surjectivity. Suppose $f : X \to Y$ is a function. If $X$
and $Y$ are finite sets with the same number of elements, then $f$ is injective iff $f$ is surjective.


Proof. Suppose $X$ and $Y$ both have $n$ elements, and write $X = \{x_1, \ldots, x_n\}$. Assume that
$f : X \to Y$ is injective. Then the set $T = \{f(x_1), \ldots, f(x_n)\}$ is a subset of $Y$ consisting of
$n$ distinct elements. Since $Y$ has $n$ elements, this subset must be all of $Y$. This means that
every $y \in Y$ has the form $f(x_i)$ for some $x_i \in X$, so that $f$ is surjective.

Conversely, assume that $f : X \to Y$ is not injective. Then there exist $i \ne j$ with
$f(x_i) = f(x_j)$. It follows that the set $T = \{f(x_1), \ldots, f(x_n)\}$ contains fewer than $n$ elements,
since the displayed list of members of $T$ contains at least one duplicate. Thus $T$ is a proper
subset of $Y$. Letting $y$ be any element of $Y \setminus T$, we see that $y$ does not have the form $f(x)$
for any $x \in X$. Therefore $f$ is not surjective. □
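
For small $n$, Theorem 1.27 can be verified exhaustively. The Python sketch below encodes a function $\{0,\ldots,n-1\} \to \{0,\ldots,m-1\}$ as its tuple of values; the encoding and the helper names are assumptions made for this illustration.

```python
from itertools import product


def is_injective(values):
    """No value is repeated."""
    return len(set(values)) == len(values)


def is_surjective(values, m):
    """Every element of the codomain {0, ..., m-1} is hit."""
    return set(values) == set(range(m))


def injective_iff_surjective(n):
    """Check the equivalence for every function from an n-element set to an n-element set."""
    return all(is_injective(v) == is_surjective(v, n)
               for v in product(range(n), repeat=n))


print(all(injective_iff_surjective(n) for n in range(5)))  # True
```

The hypothesis $|X| = |Y|$ matters: the tuple `(0, 1)` viewed as a function into a 3-element codomain is injective but not surjective.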
The previous result does not extend to infinite sets, as shown by the following examples.
The function $f : \mathbb{N} \to \mathbb{N}$ defined by $f(n) = n+1$ is injective but not surjective. The function
$g : \mathbb{N} \to \mathbb{N}$ defined by $g(2k) = g(2k+1) = k$ for all $k \ge 0$ is surjective but not injective. The
function $\exp : \mathbb{R} \to \mathbb{R}$ defined by $\exp(x) = e^x$ is injective but not surjective. The function
$h : \mathbb{R} \to \mathbb{R}$ defined by $h(x) = x(x-1)(x+1)$ is surjective but not injective.


1.28. Theorem: Enumeration of Bijections. Suppose X and Y are two n-element sets. 
Then there are n! bijections from X to Y. 


Proof. By 1.27, a function $f : X \to Y$ is injective iff $f$ is surjective. Therefore, under the
assumption that $|X| = |Y| = n$, $f$ is injective iff $f$ is bijective. We have already seen that
the number of injections from $X$ to $Y$ is $n!/(n-n)! = n!$. The result follows. □


If $X$ is an $n$-element set and $Y$ is an $m$-element set and $m \ne n$, there are no bijections
from $X$ to $Y$ (cf. the next section).


1.29. Remark. The reader may note the similarity between the formulas obtained here
for functions and the formulas obtained earlier for words and permutations. This is not a
coincidence. Indeed, we can formally define a word $w_1 w_2 \cdots w_k$ over an alphabet $A$ as the
function $w : \{1,2,\ldots,k\} \to A$ defined by $w(i) = w_i$. The number of such words (functions)
is $|A|^k$. The word $w_1 w_2 \cdots w_k$ is a $k$-permutation of $A$ iff the $w_i$'s are all distinct iff $w$ is an
injective function. The word $w_1 w_2 \cdots w_k$ is a permutation of $A$ iff $w$ is a bijective function.
Finally, note that $w$ is surjective iff every letter in the alphabet $A$ occurs among the letters
$w_1, w_2, \ldots, w_k$.


1.6 Bijections, Cardinality, and Counting 


Bijections play a critical role in the theory of counting. Indeed, the very definition of cardi- 
nality is formulated in terms of bijections. In everyday life, we count the number of objects in 
a finite set $S$ by pointing to each object in the set in turn and saying “one,” “two,” “three,”
etc. In essence, we are setting up a bijection between S and some set {1,2,...,n} of natural 
numbers. This leads to the following definition, which provides a rigorous foundation for 
the informal notion of cardinality that we have used up to this point. 


1.30. Definition: Cardinality. For any set $A$ and any integer $n \ge 1$, we write $|A| = n$
iff there exists a bijection $f : A \to \{1,2,\ldots,n\}$. We write $|A| = 0$ iff $A = \emptyset$. For any sets
$A$ and $B$, we write $|A| = |B|$ iff there exists a bijection $f : A \to B$. We write $|A| \le |B|$ iff
there exists an injection $g : A \to B$.


These definitions apply to infinite sets as well as finite sets, although we shall be mainly
interested in finite sets. In the general case, one can prove the Schröder-Bernstein Theorem:
$|A| \le |B|$ and $|B| \le |A|$ imply $|A| = |B|$ (see [125, p. 29] or 1.156 for a proof). If $A$ is
nonempty and the axiom of choice is assumed, then $|A| \le |B|$ is equivalent to the existence
of a surjection $h : B \to A$. These properties are intuitively evident in the case of finite sets.
For more discussion of the theory of cardinality for infinite sets, see [66] or [95].

Recall that if $f : X \to Y$ and $g : Y \to Z$ are functions, the composition of $g$ and $f$ is the
function $g \circ f : X \to Z$ defined by $(g \circ f)(x) = g(f(x))$ for $x \in X$. We assume the reader is
familiar with the following theorem, so we omit its proof.


1.31. Theorem: Properties of Bijections. Let $X$, $Y$, $Z$ be any sets. (a) The identity
map $\mathrm{id}_X : X \to X$, defined by $\mathrm{id}_X(x) = x$ for all $x \in X$, is a bijection. Hence, $|X| = |X|$.
(b) A function $f : X \to Y$ is bijective iff there exists a function $f' : Y \to X$ such that
$f' \circ f = \mathrm{id}_X$ and $f \circ f' = \mathrm{id}_Y$. If such an $f'$ exists, it is unique; we call it the two-sided
inverse of $f$ and denote it by $f^{-1}$. This inverse is also a bijection, and $(f^{-1})^{-1} = f$. Hence,
$|X| = |Y|$ implies $|Y| = |X|$. (c) The composition of two bijections is a bijection. Hence,
$|X| = |Y|$ and $|Y| = |Z|$ imply $|X| = |Z|$.


The definition of cardinality can be used to prove the basic counting principle 1.1. 

1.32. Theorem. If $|A| = n$, $|B| = m$, and $A \cap B = \emptyset$, then $|A \cup B| = n+m$.


Proof. The assumption $|A| = n$ means that there is a bijection $f : A \to \{1,2,\ldots,n\}$. The
assumption $|B| = m$ means that there is a bijection $g : B \to \{1,2,\ldots,m\}$. Define a function
$h : A \cup B \to \{1,2,\ldots,n+m\}$ by setting

\[ h(x) = \begin{cases} f(x) & \text{if } x \in A; \\ g(x)+n & \text{if } x \in B. \end{cases} \]

The assumption that $A \cap B = \emptyset$ is needed to ensure that $h$ is a well-defined (single-valued)
function. Observe that $h$ does map into the required codomain $\{1,2,\ldots,n+m\}$. To see
that $h$ is a bijection, we display a two-sided inverse $h' : \{1,2,\ldots,n+m\} \to A \cup B$. We
define

\[ h'(i) = \begin{cases} f^{-1}(i) & \text{if } 1 \le i \le n; \\ g^{-1}(i-n) & \text{if } n+1 \le i \le n+m. \end{cases} \]

A routine case analysis verifies that $h \circ h'$ and $h' \circ h$ are identity maps. □


The product rule 1.8 can be phrased more formally in terms of bijections. 


1.33. Formal Product Rule. Suppose there is a bijection

\[ f : \{1,2,\ldots,n_1\} \times \{1,2,\ldots,n_2\} \times \cdots \times \{1,2,\ldots,n_k\} \to S. \]

Then $|S| = n_1 n_2 \cdots n_k$.


Proof. $S$ has the same cardinality as the product set $\{1,2,\ldots,n_1\} \times \cdots \times \{1,2,\ldots,n_k\}$,
thanks to the bijection $f$. So the result follows from 1.5. □


1.34. Remark. Let us compare the formal product rule 1.33 to the informal version of
the product rule given earlier (1.8). In informal applications of the product rule, we “build”
objects in a set $S$ by making a sequence of $k$ choices, where there are $n_i$ ways to make the $i$th
choice. The input to the bijection $f$ in the formal product rule is a $k$-tuple $(c_1, \ldots, c_k)$ where
$1 \le c_i \le n_i$ for all $i \le k$. Intuitively, $c_i$ records which choice was made at the $i$th stage. In
practice, the map $f$ is described as an algorithm that tells us how to combine the choices
$c_i$ to build an object in $S$. The key point in the intuitive product rule is that each object
in $S$ can be constructed in exactly one way by making suitable choices. This corresponds
to the requirement in the formal product rule that $f$ be a bijection onto $S$. Most erroneous
applications of the intuitive product rule occur when the underlying “construction map” $f$
is not bijective (a point that is seldom checked explicitly when using the product rule).


1.35. Example. How many 4-letter words contain at least one E? One might try to construct
such words by choosing a position that contains the E (4 choices), then filling the
remaining positions from left to right with arbitrary letters (26 choices for each position).
The product rule would then give $4 \times 26^3 = 70{,}304$ as the answer. However, this answer is
incorrect. Our choice sequence implicitly defines a function

\[ f : \{1,2,3,4\} \times \{1,2,\ldots,26\}^3 \to X, \]

where $X$ is the set of words under consideration. For example, $f(3,3,2,26) = \mathrm{CBEZ}$. Our
counting argument is flawed because the function $f$ is surjective but not bijective. For
instance, $f(1,1,5,1) = \mathrm{EAEA} = f(3,5,1,1)$.

To obtain the correct answer, one can combine the product rule and the difference rule.
There are $26^4$ words of length 4, and there are $25^4$ such words that do not contain the letter
E. So the true answer is $26^4 - 25^4 = 66{,}351$. An alternative argument that is closer to our
original attempt breaks $X$ into the disjoint union $X_1 \cup X_2 \cup X_3 \cup X_4$, where $X_i$ is the set
of four-letter words where the first occurrence of E is at position $i$. A modification of the
argument in the previous paragraph shows that $|X_i| = 25^{i-1} 26^{4-i}$, so

\[ |X| = 26^3 + 25 \cdot 26^2 + 25^2 \cdot 26 + 25^3 = 66{,}351. \]
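
The failure of the naive construction map in Example 1.35 is easy to observe by machine. The Python sketch below models the flawed map as a function `construct` (a hypothetical name), with letters numbered 1 through 26 in alphabetical order as the example's values suggest, and compares the three counts discussed above.

```python
from itertools import product
from string import ascii_uppercase


def construct(position, fillers):
    """Flawed construction map f: put E at `position` (1-based), then fill the other
    three positions from left to right with the letters numbered by `fillers`."""
    letters = [ascii_uppercase[c - 1] for c in fillers]
    word = letters[:position - 1] + ["E"] + letters[position - 1:]
    return "".join(word)


# Brute-force count of 4-letter words containing at least one E.
words_with_E = sum(1 for w in product(ascii_uppercase, repeat=4) if "E" in w)

# Count obtained by splitting on the position i of the first E.
by_first_E = sum(25 ** (i - 1) * 26 ** (4 - i) for i in range(1, 5))

print(construct(3, (3, 2, 26)))                            # CBEZ
print(construct(1, (1, 5, 1)), construct(3, (5, 1, 1)))    # EAEA EAEA
print(words_with_E, 26 ** 4 - 25 ** 4, by_first_E)         # 66351 66351 66351
```

The map has 70,304 inputs but only 66,351 distinct outputs, so some words are built more than once.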


1.36. Remark. One can give a bijective proof of the product rule (1.5), just as we gave 
a bijective proof of the sum rule (1.1) in 1.32. These bijective proofs have applications to 
the problems of listing, ranking, and unranking collections of combinatorial objects. These 
topics are discussed in Chapter 5. 
1.7 Subsets, Binary Words, and Compositions


A fundamental method for counting a finite set A is to display a bijection between A and 
some other set B whose cardinality is already known. We illustrate this basic principle by 
revisiting the enumeration of subsets and binary words, and enumerating new combinatorial 
objects called compositions. 

Let $X = \{x,y,z\}$, and consider the set $P(X)$ of all subsets of $X$. We define a bijection
$f : P(X) \to \{0,1\}^3$ as follows:


$f(\emptyset) = 000$; $f(\{x\}) = 100$; $f(\{y\}) = 010$; $f(\{z\}) = 001$;
$f(\{x,y\}) = 110$; $f(\{x,z\}) = 101$; $f(\{y,z\}) = 011$; $f(\{x,y,z\}) = 111$.


These values were computed by the following rule. Given $S \subseteq X$, we set $f(S) = w_1 w_2 w_3$
where $w_1 = 1$ if $x \in S$, $w_1 = 0$ if $x \notin S$, $w_2 = 1$ if $y \in S$, $w_2 = 0$ if $y \notin S$, $w_3 = 1$ if $z \in S$,
and $w_3 = 0$ if $z \notin S$. We see by inspection that $f$ is a bijection. Thus, $|P(X)| = |\{0,1\}^3| =
2^3 = 8$. We now generalize this example to $n$-element sets. First, we introduce notation that
will be used frequently throughout the text.


1.37. Definition: Truth Function. If $P$ is any logical statement, we set $\chi(P) = 1$ if $P$ is
true, and $\chi(P) = 0$ if $P$ is false.


1.38. Theorem: Subsets vs. Binary Words. Let $X$ be an $n$-element set. For each
ordering $x_1, \ldots, x_n$ of the elements of $X$, there is a bijection $f : P(X) \to \{0,1\}^n$. Therefore,
$|P(X)| = 2^n$.


Proof. Given $S \subseteq X$, we define $f(S) = w_1 w_2 \cdots w_n$, where $w_i = \chi(x_i \in S)$. To see that $f$
is a bijection, define $f' : \{0,1\}^n \to P(X)$ by setting

\[ f'(w_1 w_2 \cdots w_n) = \{x_i \in X : w_i = 1\}. \]

It is immediate that $f'$ is the two-sided inverse of $f$. □


For example, if $X = \{1,2,3,4,5,6,7,8\}$ with the usual ordering, then $f(\{2,5,7,8\}) =
01001011$ and $f^{-1}(10000011) = \{1,7,8\}$.
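
The two maps of Theorem 1.38 are short enough to write out. In the Python sketch below, binary words are represented as strings of the characters `0` and `1`; that representation and the function names are assumptions of this example.

```python
def subset_to_word(S, ordering):
    """The map f: record, for each x_i in the given ordering, whether x_i lies in S."""
    return "".join("1" if x in S else "0" for x in ordering)


def word_to_subset(w, ordering):
    """The inverse map f': collect the x_i whose letter w_i equals 1."""
    return {x for x, bit in zip(ordering, w) if bit == "1"}


X = list(range(1, 9))
print(subset_to_word({2, 5, 7, 8}, X))   # 01001011
print(word_to_subset("10000011", X))     # {1, 7, 8}
```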
1.39. Definition: Compositions. A composition of an integer $n > 0$ is a sequence $\alpha =
(\alpha_1, \alpha_2, \ldots, \alpha_k)$, where each $\alpha_i$ is a positive integer and $\alpha_1 + \alpha_2 + \cdots + \alpha_k = n$. The number
of parts of $\alpha$ is $k$. Let $\mathrm{Comp}_n$ be the set of all compositions of $n$.


1.40. Example. The sequences (1,3, 1,3,3) and (3, 3,3, 1,1) are two distinct compositions 
of 11 with five parts. The four compositions of 3 are 


(3), (2,1), (1,2), (1,1,1). 


1.41. Theorem: Enumeration of Compositions. For all $n > 0$, there are $2^{n-1}$
compositions of $n$.


Proof. We define a bijection $g : \mathrm{Comp}_n \to \{0,1\}^{n-1}$. Given $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k) \in \mathrm{Comp}_n$,
define

\[ g(\alpha) = 0^{\alpha_1 - 1}\, 1\, 0^{\alpha_2 - 1}\, 1 \cdots 1\, 0^{\alpha_k - 1}. \]

Here, the notation $0^j$ denotes a sequence of $j$ consecutive zeroes, and $0^0$ denotes the empty
word. For example, $g((3,1,3)) = 001100$. Since $\sum_{i=1}^{k} (\alpha_i - 1) = n-k$ and there are $k-1$
ones, we see that $g(\alpha) \in \{0,1\}^{n-1}$. Now define $g' : \{0,1\}^{n-1} \to \mathrm{Comp}_n$ as follows. We
can uniquely write any word $w \in \{0,1\}^{n-1}$ in the form $w = 0^{b_1} 1 0^{b_2} 1 \cdots 1 0^{b_k}$ where $k \ge 1$,
each $b_i \ge 0$, and $\sum_{i=1}^{k} b_i = (n-1) - (k-1) = n-k$ since there are $k-1$ ones. Define
$g'(w) = (b_1 + 1, b_2 + 1, \ldots, b_k + 1)$, which is a composition of $n$. For example, $g'(100100) =
(1,3,3)$. One may check that $g'$ is the two-sided inverse of $g$, so $g$ is a bijection. It follows
that $|\mathrm{Comp}_n| = |\{0,1\}^{n-1}| = 2^{n-1}$. □
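
The maps $g$ and $g'$ from this proof can be sketched in a few lines of Python. Compositions are modeled as tuples and binary words as strings; these representations, and the names `comp_to_word`, `word_to_comp`, and `all_compositions`, are assumptions made for the illustration.

```python
from itertools import product


def comp_to_word(alpha):
    """The map g: a part a contributes a - 1 zeroes, and consecutive parts are separated by a 1."""
    return "1".join("0" * (a - 1) for a in alpha)


def word_to_comp(w):
    """The map g': split w at each 1; a block of b zeroes becomes the part b + 1."""
    return tuple(len(block) + 1 for block in w.split("1"))


def all_compositions(n):
    """List the compositions of n > 0 by applying g' to every binary word of length n - 1."""
    return [word_to_comp("".join(bits)) for bits in product("01", repeat=n - 1)]


print(comp_to_word((3, 1, 3)))   # 001100
print(word_to_comp("100100"))    # (1, 3, 3)
print(all_compositions(3))       # [(3,), (2, 1), (1, 2), (1, 1, 1)]
```

Splitting at the ones works even when two ones are adjacent or a one sits at either end, because the empty block between them corresponds to a part equal to 1.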


The bijections in the preceding proof are best understood pictorially. We represent an
integer $i > 0$ as a sequence of $i$ unit squares glued together. We visualize a composition
$(\alpha_1, \ldots, \alpha_k)$ by drawing the squares for $\alpha_1, \ldots, \alpha_k$ in a single row, separated by gaps. For
instance, the composition (1,3,1,3,3) is represented by the picture


□ □□□ □ □□□ □□□


We now scan the picture from left to right and record what happens between each two
successive boxes. If the two boxes in question are glued together, we record a 0; if there is
a gap between the two boxes, we record a 1. The composition of 11 pictured above maps
to the word $1001100100 \in \{0,1\}^{10}$. Going the other way, the word $0101000011 \in \{0,1\}^{10}$
leads first to the picture


□□ □□ □□□□□ □ □


and then to the composition (2,2,5,1,1). One can check that the pictorial operations just
described correspond precisely to the maps $g$ and $g'$ in the proof above. When $n = 3$, we
have:


$g((3)) = 00$; $g((2,1)) = 01$; $g((1,2)) = 10$; $g((1,1,1)) = 11$.


1.8 Subsets of a Fixed Size 


We turn now to the enumeration of the k-element subsets of an n-element set. For example, 
there are ten 3-element subsets of {a, b, c, d, e}: 


{a,b,c}, {a,b,d}, {a,b,e}, {a,c,d}, {a,c,e},
{a,d,e}, {b,c,d}, {b,c,e}, {b,d,e}, {c,d,e}.


FIGURE 1.6
The set {a,b,c,d,e} and its 3-element subsets.


In this example, we present a given set by listing its members between curly braces. This 
notation forces us to list the members of each set in a particular order (alphabetical in 
this case). If we reorder the members of the list, the underlying set does not change. For 
example, the sets $A_1 = \{a,c,d\}$ and $A_2 = \{c,d,a\}$ and $A_3 = \{d,c,a\}$ are all equal. This
assertion follows from the very definition of set equality: $A = B$ means that for every $x$,
$x \in A$ iff $x \in B$. In contrast, the ordering of elements in a sequence (or word) definitely
makes a difference. For instance, the words cad and dac are unequal although they use the 
same three letters. 

To emphasize that the members of a set do not come in any particular order, we often 
picture a finite set as a circle with the members of the set floating around in random 
positions inside the circle. For example, Figure 1.6 depicts the sets mentioned above. 

Suppose we try to enumerate the k-element subsets of a given n-element set using the 
product rule. Recall that the product rule requires us to construct objects by making an 
ordered sequence of choices. We might try to construct a subset by choosing its first element 
in $n$ ways, then its second element in $n-1$ ways, etc., which leads to the incorrect answer
$n(n-1)\cdots(n-k+1)$. The trouble here is that there is no well-defined “first element”
of a subset. In fact, our naive construction procedure generates each subset several times, 
once for each possible ordering of its members. There are k! such orderings, so we obtain 
the correct answer by dividing the previous formula by k!. We make this argument more 
precise in the next theorem. 


1.42. Theorem: Enumeration of k-element Subsets. For $0 \le k \le n$, the number of
$k$-element subsets of an $n$-element set is

\[ \frac{n!}{k!\,(n-k)!}. \]


Proof. Fix $n$ and $k$ with $0 \le k \le n$. Let $A$ be an $n$-element set, and let $x$ denote the
number of $k$-element subsets of $A$. Let $S$ be the set of all $k$-permutations of $A$. Recall that
elements of $S$ are ordered sequences $w_1 w_2 \cdots w_k$, where the $w_i$ are distinct elements of $A$.
We compute $|S|$ in two ways. First, we have already seen that $|S| = n!/(n-k)!$ by using
the product rule: we choose $w_1$ in $n$ ways, then choose $w_2$ in $n-1$ ways, etc., and finally
choose $w_k$ in $n-k+1$ ways. On the other hand, here is a second way to construct a typical
sequence $w_1 w_2 \cdots w_k$ in $S$. Begin by choosing a $k$-element subset of $A$ in any of $x$ ways.
Then write down a permutation of this $k$-element subset in any of $k!$ ways. The result is an
element of $S$. By the product rule,

\[ x \cdot k! = |S| = n!/(n-k)!. \]

Solving for $x$, we obtain the desired formula. □
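
The double count in this proof can be replayed on a small example. The Python sketch below is an illustration under the assumption $k \le n$; `two_way_count` is a hypothetical helper name, and tuples stand in for both subsets and words.

```python
from itertools import combinations, permutations
from math import factorial


def two_way_count(alphabet, k):
    """Count the k-permutations of `alphabet` in the two ways used in the proof (k <= n)."""
    n = len(alphabet)
    # First way: list the k-permutations directly.
    direct = set(permutations(alphabet, k))
    # Second way: choose a k-element subset, then choose an ordering of that subset.
    subsets = list(combinations(alphabet, k))
    built = [p for s in subsets for p in permutations(s)]
    # Every k-permutation is built exactly once by the second method.
    assert len(built) == len(set(built)) and set(built) == direct
    assert len(direct) == factorial(n) // factorial(n - k)
    return len(subsets), len(direct)


x, size_S = two_way_count("abcde", 3)
print(x, size_S)  # 10 60
```

Here $x \cdot k! = 10 \cdot 6 = 60 = 5!/2!$, as the proof predicts.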


1.43. Definition: Binomial Coefficients. For $0 \le k \le n$, the binomial coefficient is

\[ \binom{n}{k} = C(n,k) = \frac{n!}{k!\,(n-k)!}. \]

For $k < 0$ or $k > n$, we define $\binom{n}{k} = C(n,k) = 0$. Thus, for all $n \ge 0$ and all $k$, $\binom{n}{k}$ is the
number of $k$-element subsets of an $n$-element set. In particular, $\binom{n}{k}$ is always an integer.
1.9 Anagrams


1.44. Definition: Anagrams. Suppose $a_1, \ldots, a_k$ are distinct letters from some alphabet
$A$ and $n_1, \ldots, n_k$ are nonnegative integers. Let $R(a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k})$ denote the set of all words
$w = w_1 w_2 \cdots w_n$ that are formed by rearranging $n_1$ copies of $a_1$, $n_2$ copies of $a_2$, ..., $n_k$
copies of $a_k$ (so that $n = n_1 + n_2 + \cdots + n_k$). Words in a given set $R(a_1^{n_1} \cdots a_k^{n_k})$ are said
to be anagrams or rearrangements of one another.


1.45. Example.
$R(0^2 1^3)$ = {00111, 01011, 01101, 01110, 10011, 10101, 10110, 11001, 11010, 11100};
$R(a^1 b^2 c^1 d^0)$ = {abbc, abcb, acbb, babc, bacb, bbac, bbca, bcab, bcba, cabb, cbab, cbba}.


1.46. Theorem: Enumeration of Anagrams. Suppose $a_1, \ldots, a_k$ are distinct letters,
$n_1, \ldots, n_k$ are nonnegative integers, and $n = n_1 + \cdots + n_k$. Then

\[ |R(a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k})| = \frac{n!}{n_1!\, n_2! \cdots n_k!}. \]


Proof. We give two proofs of this result. First Proof: We use a technique similar to that used
in the proof of 1.42. Define a new alphabet $A$ consisting of $n$ distinct letters by attaching
distinct numerical superscripts to each copy of the given letters $a_1, \ldots, a_k$:

\[ A = \{a_1^{(1)}, a_1^{(2)}, \ldots, a_1^{(n_1)}, a_2^{(1)}, \ldots, a_2^{(n_2)}, \ldots, a_k^{(1)}, \ldots, a_k^{(n_k)}\}. \]

Let $x = |R(a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k})|$. Let $S$ be the set of all permutations $w$ of $A$. We count $|S|$ in
two ways. On one hand, we already know that $|S| = n!$ (choose $w_1$ in $n$ ways, then $w_2$
in $n-1$ ways, etc.). On the other hand, here is a different method for constructing each
permutation of $A$ exactly once. First, choose a word $v \in R(a_1^{n_1} \cdots a_k^{n_k})$ in any of $x$ ways.
Second, attach the superscripts 1 through $n_1$ to the $n_1$ copies of $a_1$ in $v$ in any of $n_1!$ ways.
Third, attach the superscripts 1 through $n_2$ to the $n_2$ copies of $a_2$ in $v$ in any of $n_2!$ ways.
Continue similarly; at the last stage, we attach the superscripts 1 through $n_k$ to the $n_k$
copies of $a_k$ in $v$ in any of $n_k!$ ways. By the product rule,

\[ x \cdot n_1! \cdot n_2! \cdots n_k! = |S| = n!. \]

Solving for $x$, we obtain the desired formula.

Second Proof. The second proof relies on 1.42 and an algebraic manipulation of factorials.
We construct a typical object $w = w_1 w_2 \cdots w_n \in R(a_1^{n_1} \cdots a_k^{n_k})$ by making the following
sequence of $k$ choices. Intuitively, we are going to choose the positions of the $a_1$'s, then the
positions of the $a_2$'s, etc. First, choose any $n_1$-element subset $S_1$ of $\{1,2,\ldots,n\}$ in any of
$\binom{n}{n_1}$ ways, and define $w_i = a_1$ for all $i \in S_1$. Second, choose any $n_2$-element subset $S_2$ of
$\{1,2,\ldots,n\} \setminus S_1$ in any of $\binom{n-n_1}{n_2}$ ways, and define $w_i = a_2$ for all $i \in S_2$. At the $j$th stage
(where $1 \le j \le k$), we have already filled the positions in $S_1 \cup \cdots \cup S_{j-1} \subseteq \{1,2,\ldots,n\}$,
and there are $n - n_1 - n_2 - \cdots - n_{j-1}$ remaining positions in the word. We choose any
$n_j$-element subset $S_j$ of these remaining positions in any of $\binom{n-n_1-\cdots-n_{j-1}}{n_j}$ ways, and define
$w_i = a_j$ for all $i \in S_j$. By the product rule, the number of rearrangements is

\[ \binom{n}{n_1} \binom{n-n_1}{n_2} \binom{n-n_1-n_2}{n_3} \cdots \binom{n-n_1-\cdots-n_{k-1}}{n_k}. \]

This is a telescoping product that simplifies to $n!/(n_1!\, n_2! \cdots n_k!)$. For instance, when $k = 4$,
the product is

\[ \frac{n!}{n_1!\,(n-n_1)!} \cdot \frac{(n-n_1)!}{n_2!\,(n-n_1-n_2)!} \cdot \frac{(n-n_1-n_2)!}{n_3!\,(n-n_1-n_2-n_3)!} \cdot \frac{(n-n_1-n_2-n_3)!}{n_4!\,(n-n_1-n_2-n_3-n_4)!}, \]

which simplifies to $n!/(n_1!\, n_2!\, n_3!\, n_4!)$. (Recall that $(n - n_1 - n_2 - \cdots - n_k)! = 0! = 1$.) □
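
Both proofs lead to the same number, which can be compared with a brute-force count for short words. The Python sketch below is illustrative only; the function names are hypothetical, and the brute-force routine lists all $n!$ orderings, so it is practical only for small $n$.

```python
from itertools import permutations
from math import comb, factorial


def count_anagrams_brute(word):
    """Count distinct rearrangements of `word` by listing every ordering of its letters."""
    return len(set(permutations(word)))


def multinomial(counts):
    """Return n!/(n_1! n_2! ... n_k!) where n is the sum of the counts."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result


def telescoping_product(counts):
    """Second proof: choose positions for each letter in turn among those still free."""
    remaining = sum(counts)
    result = 1
    for c in counts:
        result *= comb(remaining, c)
        remaining -= c
    return result


print(count_anagrams_brute("abbc"), multinomial([1, 2, 1, 0]))   # 12 12
print(multinomial([3, 1, 4]), telescoping_product([3, 1, 4]))    # 280 280
```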


1.47. Example. We now illustrate the constructions in each of the two preceding proofs.
For the first proof, suppose we are counting $R(a^3 b^1 c^4)$. The alphabet $A$ in the proof consists
of the eight distinct letters

\[ A = \{a^{(1)}, a^{(2)}, a^{(3)}, b^{(1)}, c^{(1)}, c^{(2)}, c^{(3)}, c^{(4)}\}. \]

Let us build a specific permutation of $A$ using the second counting method. First, choose
an element of $R(a^3 b^1 c^4)$, say $v = baccaacc$. Second, choose a labeling of the $a$'s with
superscripts, say $b\,a^{(3)} c\,c\,a^{(1)} a^{(2)} c\,c$. Third, choose a labeling of the $b$'s, say $b^{(1)} a^{(3)} c\,c\,a^{(1)} a^{(2)} c\,c$.
Finally, choose a labeling of the $c$'s, say $b^{(1)} a^{(3)} c^{(1)} c^{(2)} a^{(1)} a^{(2)} c^{(4)} c^{(3)}$. We have now
constructed a permutation of the alphabet $A$.

Next, let us see how to build the word ‘baccaacc’ using the method of the second proof. 
Start with an empty 8-letter word, which we denote --------. We first choose the 3-element
subset {2,5,6} of {1,2,...,8} and put a’s in those positions, obtaining -a--aa--. We then 
choose the 1-element subset {1} of {1,3,4,7,8} and put a b in that position, obtaining 
ba--aa--. Finally, we choose the 4-element subset {3,4,7,8} of {3,4,7,8} and put c’s in 
those positions, obtaining the word baccaacc. 


1.48. Definition: Multinomial Coefficients. Suppose $n_1, \ldots, n_k$ are nonnegative
integers and $n = n_1 + \cdots + n_k$. The multinomial coefficient is

\[ \binom{n}{n_1, n_2, \ldots, n_k} = C(n; n_1, n_2, \ldots, n_k) = \frac{n!}{n_1!\, n_2! \cdots n_k!}. \]

This is the number of rearrangements of $k$ letters where there are $n_i$ copies of the $i$th letter.


1.49. Theorem: Binomial vs. Multinomial Coefficients. For all nonnegative integers
$a$ and $b$, we have

\[ \binom{a+b}{a,b} = \binom{a+b}{a}. \]
Proof. The result is immediate from the formulas for binomial coefficients and multinomial
coefficients as quotients of factorials, but we want to give a bijective proof. Let $U$ be the
set of $a$-element subsets of $\{1,2,\ldots,a+b\}$, and let $V = R(1^a 0^b)$. We have already shown
that $|U| = \binom{a+b}{a}$ and $|V| = \binom{a+b}{a,b}$. So we must define a bijection $f : U \to V$. Given
$S \in U$, let $f(S) = w_1 w_2 \cdots w_{a+b}$, where $w_i = \chi(i \in S)$. Since $S$ has $a$ elements, $f(S)$ is a
word consisting of $a$ ones and $b$ zeroes. The inverse of $f$ is the map $f' : V \to U$ given by
$f'(w_1 w_2 \cdots w_{a+b}) = \{i : w_i = 1\}$. (Note that these maps are the restrictions to $U$ and $V$ of
the maps $f$ and $f'$ from the proof of 1.38.) □


1.50. Example: Compositions with k Parts. Let us determine the number of compositions
$\alpha = (\alpha_1, \ldots, \alpha_k)$ of $n$ that have exactly $k$ parts. Recall the bijection $g : \mathrm{Comp}_n \to
\{0,1\}^{n-1}$ from the proof of 1.41. Applying $g$ to $\alpha$ produces a word with $k-1$ ones and
$n-k$ zeroes. Conversely, any such word arises from a composition with $k$ parts. Thus, $g$
restricts to a bijection between the set of compositions of $n$ with $k$ parts and the set of
words $R(0^{n-k} 1^{k-1})$. Consequently, the number of such compositions is

\[ \binom{n-1}{n-k,\, k-1} = \binom{n-1}{k-1}. \]
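
The count of compositions with exactly $k$ parts can be checked by direct enumeration for small $n$. In the Python sketch below, the recursive generator `compositions` and the helper `count_with_k_parts` are hypothetical names; the generator picks the first part and then recurses on what remains.

```python
from math import comb


def compositions(n):
    """Yield all compositions of n >= 0 as tuples (only the empty tuple when n == 0)."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def count_with_k_parts(n, k):
    """Count the compositions of n that have exactly k parts."""
    return sum(1 for alpha in compositions(n) if len(alpha) == k)


print(count_with_k_parts(6, 3), comb(5, 2))  # 10 10
```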


1.10 Lattice Paths 
1.51. Definition: Lattice Paths. A lattice path in the plane is a sequence

\[ P = ((x_0, y_0), (x_1, y_1), \ldots, (x_k, y_k)), \]

where the $x_i$'s and $y_i$'s are integers, and for each $i \ge 1$, either $(x_i, y_i) = (x_{i-1}+1, y_{i-1})$ or
$(x_i, y_i) = (x_{i-1}, y_{i-1}+1)$. We say that $P$ is a path from $(x_0, y_0)$ to $(x_k, y_k)$.
We often take $(x_0, y_0)$ to be the origin (0,0). We represent $P$ pictorially by drawing a
line segment of length 1 from $(x_{i-1}, y_{i-1})$ to $(x_i, y_i)$ for each $i$. For example, Figure 1.7
displays the ten lattice paths from (0,0) to (2,3).


FIGURE 1.7 
Lattice paths from (0,0) to (2,3). 


1.52. Theorem: Enumeration of Lattice Paths in a Rectangle. For all integers
$a, b \ge 0$, there are $\binom{a+b}{a,b} = \frac{(a+b)!}{a!\,b!}$ lattice paths from (0,0) to $(a,b)$.




FIGURE 1.8 
Dyck paths of order 3. 


Proof. We can encode a lattice path $P$ from (0,0) to $(a,b)$ as a word $w \in R(E^a N^b)$ by
setting $w_i = E$ if $(x_i, y_i) = (x_{i-1}+1, y_{i-1})$ and $w_i = N$ if $(x_i, y_i) = (x_{i-1}, y_{i-1}+1)$. Here, E
stands for “east step,” and N stands for “north step.” Since the path ends at $(a,b)$, $w$ must
have exactly $a$ occurrences of E and exactly $b$ occurrences of N. Thus we have a bijection
between the given set of lattice paths and the set $R(E^a N^b)$. Since $|R(E^a N^b)| = \binom{a+b}{a,b}$, the
theorem follows. □


For example, the paths shown in Figure 1.7 are encoded by the words 


NNNEE, NNENE, NNEEN, NENNE, NENEN, 
NEENN, ENNNE, ENNEN, ENENN, EENNN. 
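
The encoding of lattice paths as words can be reproduced with a short program. The Python sketch below builds the words of $R(E^a N^b)$ by choosing which steps are east steps and converts a word back to its list of lattice points; the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def lattice_path_words(a, b):
    """List the words with a letters E and b letters N by choosing the positions of the E's."""
    words = []
    for east_positions in combinations(range(a + b), a):
        east = set(east_positions)
        words.append("".join("E" if i in east else "N" for i in range(a + b)))
    return words


def points(word):
    """Return the lattice points visited by the path encoded by `word`, starting at (0, 0)."""
    x = y = 0
    visited = [(0, 0)]
    for step in word:
        if step == "E":
            x += 1
        else:
            y += 1
        visited.append((x, y))
    return visited


words = lattice_path_words(2, 3)
print(len(words), comb(5, 2))   # 10 10
print(points("NENEN")[-1])      # (2, 3)
```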


More generally, one can consider lattice paths in $\mathbb{R}^d$. Such a path is a sequence of points
$(v_0, v_1, \ldots, v_k)$ in $\mathbb{Z}^d$ such that for each $i$, $v_i = v_{i-1} + e_j$ for some standard basis vector
$e_j = (0, \ldots, 1, \ldots, 0) \in \mathbb{R}^d$ (the 1 occurs in position $j$).


1.53. Theorem: Enumeration of Lattice Paths in a d-dimensional Rectangle. For
all integers $n_1, \ldots, n_d \ge 0$, the number of $d$-dimensional lattice paths from $(0, \ldots, 0)$ to
$(n_1, \ldots, n_d)$ is

\[ |R(e_1^{n_1} e_2^{n_2} \cdots e_d^{n_d})| = \binom{n_1 + n_2 + \cdots + n_d}{n_1, n_2, \ldots, n_d}. \]

Proof. Encode a path $P$ by the word $w_1 w_2 \cdots w_n$, where $n = n_1 + \cdots + n_d$ and $w_i = e_j$ iff
$v_i = v_{i-1} + e_j$. □

Henceforth, we usually will make no distinction between a lattice path (which is a 
sequence of lattice points) and the word that encodes the lattice path. 
We now turn to a more difficult enumeration problem involving lattice paths. 


1.54. Definition: Dyck Paths. A Dyck path of order $n$ is a lattice path from (0,0) to
$(n,n)$ such that $y_i \ge x_i$ for all points $(x_i, y_i)$ on the path. This requirement means that the
path always stays weakly above the line $y = x$. For example, Figure 1.8 displays the five
Dyck paths of order 3.


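1.55. Definition: Catalan Numbers. For $n \ge 0$, the $n$th Catalan number is

\[ C_n = \frac{1}{n+1}\binom{2n}{n,n} = \frac{1}{2n+1}\binom{2n+1}{n+1,n} = \frac{(2n)!}{n!\,(n+1)!} = \binom{2n}{n,n} - \binom{2n}{n+1,n-1}. \]

One may check that these expressions are all equal. For instance,

\[ \binom{2n}{n,n} - \binom{2n}{n+1,n-1} = \frac{(2n)!}{n!\,n!} - \frac{(2n)!}{(n+1)!\,(n-1)!} = \frac{(2n)!}{n!\,n!}\left(1 - \frac{n}{n+1}\right) = \frac{1}{n+1}\binom{2n}{n,n}. \]

The first few Catalan numbers are

\[ C_0 = 1,\ C_1 = 1,\ C_2 = 2,\ C_3 = 5,\ C_4 = 14,\ C_5 = 42,\ C_6 = 132,\ C_7 = 429. \]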
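
The equality of the four expressions in Definition 1.55 can be spot-checked numerically. The Python sketch below evaluates each expression separately using `math.comb` and `math.factorial`; the name `catalan_variants` is hypothetical, and the two-part multinomial coefficients are rewritten as binomial coefficients, which is justified by Theorem 1.49.

```python
from math import comb, factorial


def catalan_variants(n):
    """Return the four expressions for the n-th Catalan number, each computed on its own."""
    first = comb(2 * n, n) // (n + 1)                       # (1/(n+1)) * C(2n; n, n)
    second = comb(2 * n + 1, n) // (2 * n + 1)              # (1/(2n+1)) * C(2n+1; n+1, n)
    third = factorial(2 * n) // (factorial(n) * factorial(n + 1))
    fourth = comb(2 * n, n) - comb(2 * n, n + 1)            # C(2n; n, n) - C(2n; n+1, n-1)
    return first, second, third, fourth


for n in range(8):
    values = catalan_variants(n)
    assert len(set(values)) == 1   # all four expressions agree
    print(n, values[0])
```

The integer divisions are exact here because each quotient is a Catalan number.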

FIGURE 1.9
Example of the reflection map r.


1.56. Theorem: Enumeration of Dyck Paths. For n > 0, the number of Dyck paths 
of order n is the Catalan number C,, = (o) - eee 
Proof. The following proof is essentially due to André [3]. Let A be the set of all lattice 
paths from (0,0) to (n,n); let B be the set of all lattice paths from (0,0) to (n+ 1,n— 1); 
let C be the set of all Dyck paths of order n; and let D = A~C be the set of paths from 
(0,0) to (n,n) that do go strictly below the line y = x. Since C = A ~ D, the difference 
rule gives 

|C| = |A| — |D. 
We already know that |A| = pea and |B] = (eae To establish the desired formula 
|C| = Cn, it therefore suffices to exhibit a bijection r: D > B. 

We define r as follows. Given a path $P \in D$, follow the path backwards from (n,n)
until it goes below the diagonal y = x for the first time. Let $(x_i, y_i)$ be the first lattice
point we encounter that is below y = x; this point must lie on the line y = x − 1. P is
the concatenation of two lattice paths $P_1$ and $P_2$, where $P_1$ goes from (0,0) to $(x_i, y_i)$ and
$P_2$ goes from $(x_i, y_i)$ to (n,n). By choice of i, every lattice point of $P_2$ after $(x_i, y_i)$ lies
strictly above the line y = x − 1. Now, let $P_2'$ be the path from $(x_i, y_i)$ to (n+1, n−1)
obtained by reflecting $P_2$ in the line y = x − 1. Define r(P) to be the concatenation of $P_1$
and $P_2'$. See Figure 1.9 for an example. Here, $(x_i, y_i) = (7,6)$, $P_1$ = NEEENNENEEENN,
$P_2$ = NNNEENE, and $P_2'$ = EEENNEN. Note that r(P) is a lattice path from (0,0) to
(n+1, n−1), so $r(P) \in B$. Furthermore, $(x_i, y_i)$ is the only lattice point of $P_2'$ lying on the
line y = x − 1.

The inverse map $r' : B \to D$ acts as follows. Given $Q \in B$, choose i maximal such
that $(x_i, y_i)$ is a point of Q on the line y = x − 1. Such an i must exist, since there is
no way for a lattice path to reach (n+1, n−1) from (0,0) without passing through this
line. Write $Q = Q_1 Q_2$, where $Q_1$ goes from (0,0) to $(x_i, y_i)$ and $Q_2$ goes from $(x_i, y_i)$ to
(n+1, n−1). Let $Q_2'$ be the reflection of $Q_2$ in the line y = x − 1. Define $r'(Q) = Q_1 Q_2'$, and
note that this is a lattice path from (0,0) to (n,n) which passes through $(x_i, y_i)$, and hence
lies in D. See Figure 1.10 for an example. Here, $(x_i, y_i) = (6,5)$, $Q_1$ = NNENEEEENEN,
$Q_2$ = EEENENENN, and $Q_2'$ = NNNENENEE. From our observations about the point
$(x_i, y_i)$ in this paragraph and the last, one sees that $r'$ is the two-sided inverse of r. □

FIGURE 1.10
Example of the inverse reflection map.


The technique used in the preceding proof is called André’s reflection principle. Another
proof of the theorem, which leads directly to the formula $\frac{1}{2n+1}\binom{2n+1}{n+1,n}$, is given in §12.
Yet another proof, which leads directly to the formula $\frac{1}{n+1}\binom{2n}{n,n}$, is given in §12.2.
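The reflection argument can be checked by brute force for small n. The following Python sketch is an illustration only: the function names (`lattice_paths`, `points`, `is_dyck`, `reflect`) are our own and do not come from the text, and paths are encoded as words in the letters N and E as above. It relies on the fact that reflecting a path segment in a line of slope 1 swaps north and east steps.

```python
from itertools import combinations
from math import comb

def lattice_paths(east, north):
    """Yield every word with `east` E-steps and `north` N-steps."""
    total = east + north
    for positions in combinations(range(total), east):
        chosen = set(positions)
        yield "".join("E" if i in chosen else "N" for i in range(total))

def points(word):
    """List the lattice points visited by the path that `word` encodes."""
    x = y = 0
    visited = [(0, 0)]
    for step in word:
        x, y = (x + 1, y) if step == "E" else (x, y + 1)
        visited.append((x, y))
    return visited

def is_dyck(word):
    """A path is a Dyck path when every visited point satisfies y >= x."""
    return all(y >= x for x, y in points(word))

def reflect(word):
    """Reflect the part of the path after its last visit to y = x - 1.

    The same function plays the role of r (on D) and of r' (on B).
    It raises ValueError if the path never touches the line y = x - 1.
    """
    visited = points(word)
    i = max(j for j, (x, y) in enumerate(visited) if y == x - 1)
    swap = {"N": "E", "E": "N"}
    return word[:i] + "".join(swap[s] for s in word[i:])

n = 4
A = list(lattice_paths(n, n))            # all paths from (0,0) to (n,n)
D = [w for w in A if not is_dyck(w)]     # paths that dip below y = x
B = set(lattice_paths(n + 1, n - 1))     # paths from (0,0) to (n+1,n-1)
dyck_count = len(A) - len(D)             # difference rule: |C| = |A| - |D|
```

For n = 4 the set of reflected paths coincides with B, and `dyck_count` agrees with $\binom{8}{4,4} - \binom{8}{5,3}$. Applying `reflect` to the path of Figure 1.9 swaps the steps of $P_2$ and leaves $P_1$ unchanged.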




1.11 Multisets 


Recall that the concepts of order and multiplicity play no role when deciding whether two 
sets are equal. For instance, {1,3,5} = {3,5,1} = {1,1,1,5,5,3,3,3} since all these sets 
have the same members. We now introduce the concept of a multiset, in which order still 
does not matter, but repetitions of a given element are significant. 


1.57. Definition: Multisets. A multiset is an ordered pair M = (S, m), where S is a set
and $m : S \to \mathbb{N}^+$ is a function. For $x \in S$, the number m(x) is called the multiplicity of x
in M. The number of elements of M is $|M| = \sum_{x \in S} m(x)$.


In contrast, the number of distinct elements in M is |S|. We sometimes display a multiset
as a list $[x_1, x_2, \ldots, x_k]$, where each $x \in S$ occurs exactly m(x) times in the list. However,
one must remember that the order of the elements in this list does not matter when deciding
equality of multisets. For example, $[1,1,2,3,3] \ne [1,1,1,2,3,3] = [3,2,3,1,1,1]$. We often
visualize M as a circle with the elements $x \in S$ appearing inside, each repeated m(x)
times. For example, Figure 1.11 displays the twenty 3-element multisets using letters in the
alphabet {w, x, y, z}. The last circle in the first row represents the multiset [w, x, x], which
is formally the ordered pair ({w, x}, m) such that m(w) = 1 and m(x) = 2.


1.58. Theorem: Enumeration of Multisets. The number of k-element multisets using
letters from an n-letter alphabet is

$$\binom{k+n-1}{k,n-1} = \frac{(k+n-1)!}{k!\,(n-1)!}.$$

FIGURE 1.11
The 3-element multisets over the alphabet {w, x, y, z}.


Proof. We give two proofs of this result. First Proof: Let A be a fixed n-letter alphabet, and
let U be the set of all k-element multisets using letters from A. Introduce the two symbols
$\star$ (“star”) and $|$ (“bar”), and let $V = R(\star^k\,|^{n-1})$ be the set of all rearrangements of k stars
and n − 1 bars. We know that $|V| = \binom{k+n-1}{k,n-1}$. It therefore suffices to define a bijection
$f : U \to V$.

Let $(a_1, a_2, \ldots, a_n)$ be a fixed ordering of the alphabet A. Let M = (S, m) be a typical
multiset in U. Set $m(a_i) = 0$ if $a_i \notin S$. Define

$$f(M) = \star^{m(a_1)}\,|\,\star^{m(a_2)}\,|\,\cdots\,|\,\star^{m(a_{n-1})}\,|\,\star^{m(a_n)} \in V.$$


In other words, we write a star for each occurrence of $a_1$ in M (if any), then a bar, then
a star for each occurrence of $a_2$ in M (if any), then a bar, etc. There is no bar after the
stars for $a_n$, so there are only n − 1 bars total. Since M has k elements, there are k stars
total. Thus f(M) really is an element of V. For example, the multisets in the first column
of Figure 1.11 are mapped to the following star-bar words:

$$f([w,w,w]) = \star\star\star|||,\quad f([w,x,y]) = \star|\star|\star|,\quad f([x,x,x]) = |\star\star\star||,\quad f([x,z,z]) = |\star||\star\star.$$


The multiset M is uniquely determined by f(M). More precisely, define $f' : V \to U$ by
letting $f'(\star^{m_1}\,|\,\star^{m_2}\,|\,\cdots\,|\,\star^{m_n})$ be the unique multiset that has $m_i$ copies of $a_i$ for $1 \le i \le n$
(here $m_i \ge 0$). Since $\sum_{i=1}^{n} m_i = k$, this is a k-element multiset using letters from A. For
example, if n = 6, k = 4, and A = {1,2,3,4,5,6}, then $f'(||\star||\star|\star\star) = [3,5,6,6]$. One
may check that $f'$ is the two-sided inverse of f.
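The star-bar encoding in the first proof is easy to experiment with. The sketch below is a minimal illustration, not part of the original proof: the names `encode` and `decode` are our own, multisets are represented as Python lists, and the characters `*` and `|` stand for the star and bar symbols.

```python
from itertools import combinations_with_replacement
from math import comb

def encode(multiset, alphabet):
    """Map a multiset (a list of letters) to its star-bar word f(M)."""
    # One block of stars per letter of the alphabet, separated by bars.
    return "|".join("*" * multiset.count(a) for a in alphabet)

def decode(word, alphabet):
    """Inverse map f': read the multiplicities off the blocks of stars."""
    result = []
    for letter, stars in zip(alphabet, word.split("|")):
        result.extend([letter] * len(stars))
    return result

alphabet = "wxyz"
# All 3-element multisets over the alphabet, each as a sorted list.
multisets = [list(m) for m in combinations_with_replacement(alphabet, 3)]
words = {encode(m, alphabet) for m in multisets}
```

With this alphabet, `encode(list("www"), alphabet)` produces the word `***|||`, and `decode("||*||*|**", [1, 2, 3, 4, 5, 6])` recovers the multiset [3, 5, 6, 6] from the example above. The twenty multisets are sent to twenty distinct words, in agreement with $\binom{3+4-1}{3,3} = 20$.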

Second Proof: We may assume (without loss of generality) that the alphabet A is
{1,2,...,n}. As above, let U be the set of all k-element multisets using letters from A.
Let W be the set of all k-element subsets of B = {1,2,...,k+n−1}. We know that
$|W| = \binom{k+n-1}{k} = \binom{k+n-1}{k,n-1}$. So it suffices to define a bijection $g : U \to W$.

Given $M \in U$, we can write M uniquely in the form $M = [x_1, x_2, \ldots, x_k]$ by requiring
that $x_1 \le x_2 \le \cdots \le x_k$. Now define

$$g(M) = g([x_1, x_2, \ldots, x_k]) = \{x_1 + 0,\ x_2 + 1,\ x_3 + 2,\ \ldots,\ x_i + (i-1),\ \ldots,\ x_k + (k-1)\}.$$

For example, if n = 5, k = 5, and M = [1,1,4,5,5], then $g(M) = \{1,2,6,8,9\} \subseteq
\{1,2,\ldots,9\}$. Notice that the elements of the set g(M) all lie in {1,2,...,k+n−1} since
$1 \le x_i \le n$ for all i. Also, the k displayed elements of g(M) are pairwise distinct because,
for any i < j, the assumption $x_i \le x_j$ implies $x_i + (i-1) < x_j + (j-1)$. Thus, g(M) is
indeed a k-element subset of B.

Going the other way, define $g' : W \to U$ as follows. Given $S \in W$, we can write S
uniquely in the form $S = \{y_1, y_2, \ldots, y_k\}$ where $y_1 < y_2 < \cdots < y_k$. Now define

$$g'(S) = g'(\{y_1, y_2, \ldots, y_k\}) = [y_1 - 0,\ y_2 - 1,\ \ldots,\ y_i - (i-1),\ \ldots,\ y_k - (k-1)].$$

For example, if n = k = 5 and S = {2,3,5,7,8}, then $g'(S) = [2,2,3,4,4] \in U$. Since every
$y_j \ge 1$ and the $y_j$’s form a strictly increasing sequence of integers, it follows that $i \le y_i$ for
all i. Similarly, since every $y_j \le k+n-1$ and there are k − i entries that exceed $y_i$ in the
sequence (namely $y_{i+1}, \ldots, y_k$), we deduce that $y_i \le (k+n-1) - (k-i) = n+i-1$ for all
i. Subtracting i − 1, it follows that every element of the k-element multiset $g'(S)$ lies in the
range {1,2,...,n}, so that $g'(S)$ really is an element of U. It is now routine to check that
$g'$ is the two-sided inverse of g. □
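The maps g and g′ of the second proof amount to adding and subtracting the offsets 0, 1, ..., k − 1 after sorting. The following sketch assumes multisets are given as lists of integers and subsets as Python sets; the function names `g` and `g_inverse` are our own labels for the two maps.

```python
from itertools import combinations, combinations_with_replacement

def g(multiset):
    """Send [x_1 <= ... <= x_k] to the set {x_1 + 0, x_2 + 1, ..., x_k + (k-1)}."""
    return {x + i for i, x in enumerate(sorted(multiset))}

def g_inverse(subset):
    """Send {y_1 < ... < y_k} to the multiset [y_1 - 0, y_2 - 1, ..., y_k - (k-1)]."""
    return [y - i for i, y in enumerate(sorted(subset))]

# Exhaustive check for a small case: n = 4 letters, k = 3 elements.
n, k = 4, 3
U = [list(m) for m in combinations_with_replacement(range(1, n + 1), k)]
W = {frozenset(s) for s in combinations(range(1, k + n), k)}
images = {frozenset(g(m)) for m in U}
```

Here `g([1, 1, 4, 5, 5])` returns the set {1, 2, 6, 8, 9} and `g_inverse({2, 3, 5, 7, 8})` returns [2, 2, 3, 4, 4], matching the two worked examples. In the small case, `images` equals `W`, as the bijection requires.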




1.12 Probability 


The basic techniques of counting can be applied to solve a number of problems from probabil- 
ity theory. This section introduces some fundamental concepts of probability and considers 
several examples. 


1.59. Definition: Sample Spaces and Events. A sample space is a set S, whose members
represent the possible outcomes of a “random experiment.” In this section, we only consider
finite sample spaces. An event is a subset of the sample space.


Intuitively, an event consists of the set of outcomes of the experiment that possess a 
particular property we are interested in. 


1.60. Example: Coin Tossing. Suppose the experiment consists of tossing a coin five
times. We could take the sample space for this experiment to be $S = \{H,T\}^5$, the set of all
5-letter words using the letters H (for heads) and T (for tails). The element $HHHTH \in S$
represents the outcome where the fourth toss was tails and all other tosses were heads. The
subset $A = \{w \in S : w_1 = H\}$ is the event in which the first toss comes up heads. The
subset $B = \{w \in S : w_1 \ne w_5\}$ is the event that the first toss is different from the last toss.
The subset

$$C = \{w \in S : w_i = T \text{ for an odd number of indices } i\}$$

is the event that we get an odd number of tails.


1.61. Example: Dice Rolling. Suppose the experiment consists of rolling a six-sided die
three times. The sample space for this experiment is $S = \{1,2,3,4,5,6\}^3$, the set of all
3-letter words over the alphabet {1,2,...,6}. The subset $A = \{w \in S : w_1 + w_2 + w_3 \in
\{7,11\}\}$ is the event that the sum of the three numbers rolled is 7 or 11. The subset
$B = \{w \in S : w_1 = w_2 = w_3\}$ is the event that all three numbers rolled are the same. The
subset $C = \{w \in S : w \ne (4,1,3)\}$ is the event that we do not see the numbers 4, 1, 3 (in
that order) in the dice rolls.


1.62. Example: Lotteries. Consider the following random experiment. We put 49 white
balls (numbered 1 through 49) into a machine that mixes the balls for a while and then
outputs a sequence of six distinct balls, one at a time. We could take the sample space here to
be the set S′ of all 6-letter words w consisting of six distinct letters from A = {1,2,...,49}.
In lotteries, the order in which the balls are drawn usually does not matter, so it is more
common to take the sample space to be the set S of all 6-element subsets of A. (We will
see later that using S instead of S′ does not affect the probabilities we are interested in.)
Suppose a lottery player picks a (fixed and known) 6-element subset $T_0$ of A. For $0 \le k \le 6$,
define events $B_k = \{T \in S : |T \cap T_0| = k\} \subseteq S$. Intuitively, the event $B_k$ is the set of
outcomes in which the player has matched exactly k of the winning lottery numbers.


1.63. Example: Special Events. For any sample space S, $\emptyset$ and S are events. Intuitively,
the event $\emptyset$ contains no outcomes, and therefore “never happens.” On the other hand, the
event S contains all the outcomes, and therefore “always happens.” If A and B are events
(i.e., subsets of S), note that $A \cup B$, $A \cap B$, $S \sim A$, and $A \sim B$ are also events. Intuitively,
$A \cup B$ is the event that either A happens or B happens (or both); $A \cap B$ is the event that
both A and B happen; $S \sim A$ is the event that A does not happen; and $A \sim B$ is the event
that A happens but B does not happen.


Now we can formally define the concept of probability. Intuitively, for each event A, we 
want to define a number P(A) that measures the probability or likelihood that A occurs. 
Numbers close to 1 represent more likely events, while numbers close to 0 represent less likely 
events. A probability-zero event is “impossible,” while a probability-one event is “certain” 
to occur. 


1.64. Definition: Probability. Assume S is a finite sample space. Recall that $\mathcal{P}(S)$ is the
set of all subsets of S, i.e., the set of all events. A probability measure for S is a function
$P : \mathcal{P}(S) \to [0,1]$ such that $P(\emptyset) = 0$; $P(S) = 1$; and for any two disjoint events A and B,
$P(A \cup B) = P(A) + P(B)$.


By induction, it follows that P satisfies the finite additivity property

$$P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n)$$

for all pairwise disjoint sets $A_1, A_2, \ldots, A_n \subseteq S$.


1.65. Example: Classical Probability Spaces. Suppose S is a finite sample space in
which all outcomes are equally likely. Then we must have $P(\{x\}) = 1/|S|$ for each outcome
$x \in S$. For any event $A \subseteq S$, finite additivity gives

$$P(A) = \frac{|A|}{|S|} = \frac{\text{number of favorable outcomes}}{\text{total number of outcomes}}. \tag{1.1}$$

Thus the calculation of probabilities (in this classical setup) reduces to two counting
problems: counting the number of elements in A and counting the number of elements in S.
We can take equation (1.1) as the definition of our probability measure P. Note that the
axiom $A \cap B = \emptyset \Rightarrow P(A \cup B) = P(A) + P(B)$ is then a consequence of the sum rule. Also
note that this probability model will only be appropriate if all the possible outcomes of the
underlying random experiment are equally likely to occur.


1.66. Example: Coin Tossing. Suppose we toss a fair coin five times. The sample space
is $S = \{H,T\}^5$, so that $|S| = 2^5 = 32$. Consider the event $A = \{w \in S : w_1 = H\}$ of getting
a head on the first toss. By the product rule, $|A| = 1 \cdot 2^4 = 16$, so P(A) = 16/32 = 1/2.
Consider the event $B = \{w \in S : w_1 \ne w_5\}$ in which the first toss differs from the last toss.
B is the disjoint union of $B_1 = \{w \in S : w_1 = H, w_5 = T\}$ and $B_2 = \{w \in S : w_1 = T, w_5 = H\}$.
The product rule shows that $|B_1| = |B_2| = 2^3 = 8$, so that P(B) = (8 + 8)/32 = 1/2.
Finally, consider the event

$$C = \{w \in S : w_i = T \text{ for an odd number of indices } i\}.$$

C is the disjoint union $C_1 \cup C_3 \cup C_5$, where (for $0 \le k \le 5$) $C_k$ is the event of getting exactly
k tails. We have $C_k = R(T^k H^{5-k})$, so that $P(C_k) = \binom{5}{k}/2^5$. Therefore,

$$P(C) = \frac{\binom{5}{1} + \binom{5}{3} + \binom{5}{5}}{2^5} = 16/32 = 1/2.$$
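Because the sample space here has only 32 outcomes, the three probabilities can also be confirmed by listing every outcome. The sketch below is illustrative; the helper name `prob` and the representation of events as predicates on words are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

# The sample space {H,T}^5 as a list of 5-letter words.
S = ["".join(w) for w in product("HT", repeat=5)]

def prob(event):
    """Classical probability |A| / |S| of the event {w in S : event(w)}."""
    return Fraction(sum(1 for w in S if event(w)), len(S))

p_A = prob(lambda w: w[0] == "H")             # first toss is heads
p_B = prob(lambda w: w[0] != w[4])            # first toss differs from last toss
p_C = prob(lambda w: w.count("T") % 2 == 1)   # odd number of tails
```

All three values come out to 1/2, in agreement with the counts obtained from the product rule.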


1.67. Example: Dice Rolling. Consider the experiment of rolling a six-sided die twice.
The sample space is $S = \{1,2,3,4,5,6\}^2$, so that $|S| = 6^2 = 36$. Consider the event
$A = \{x \in S : x_1 + x_2 \in \{7,11\}\}$ of rolling a sum of 7 or 11. By direct enumeration, we have

$$A = \{(1,6), (2,5), (3,4), (4,3), (5,2), (6,1), (5,6), (6,5)\}; \quad |A| = 8.$$

Therefore, P(A) = 8/36 = 2/9. Consider the event $B = \{x \in S : x_1 \ne x_2\}$ of getting two
different numbers on the two rolls. The product rule gives $|B| = 6 \cdot 5 = 30$, so P(B) =
30/36 = 5/6.


1.68. Example: Balls in Urns. Suppose an urn contains $n_1$ red balls, $n_2$ white balls, and
$n_3$ blue balls. Let the random experiment consist of randomly drawing a k-element subset
of balls from the urn. What is the probability of drawing $k_1$ red balls, $k_2$ white balls, and
$k_3$ blue balls, where $k_1 + k_2 + k_3 = k$? We can take the sample space S to be all k-element
subsets of the set

$$\{1, 2, \ldots, n_1,\ n_1+1, \ldots, n_1+n_2,\ n_1+n_2+1, \ldots, n_1+n_2+n_3\}.$$

Here the first $n_1$ integers represent red balls, the next $n_2$ integers represent white balls, and
the last $n_3$ integers represent blue balls. We know that $|S| = \binom{n_1+n_2+n_3}{k}$. Let A be the event
where we draw $k_1$ red balls, $k_2$ white balls, and $k_3$ blue balls. To build a set $T \in A$, we choose
a $k_1$-element subset of $\{1,2,\ldots,n_1\}$, then a $k_2$-element subset of $\{n_1+1,\ldots,n_1+n_2\}$, then a
$k_3$-element subset of $\{n_1+n_2+1,\ldots,n_1+n_2+n_3\}$. By the product rule, $|A| = \binom{n_1}{k_1}\binom{n_2}{k_2}\binom{n_3}{k_3}$.
Therefore, the definition of the probability measure gives

$$P(A) = \frac{\binom{n_1}{k_1}\binom{n_2}{k_2}\binom{n_3}{k_3}}{\binom{n_1+n_2+n_3}{k}}.$$

This calculation can be generalized to the case where the urn has balls of more than three
colors.
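The urn formula, including its generalization to any number of colors, takes only a few lines of code. The sketch below is one possible rendering; the function name `urn_probability` and the tuple-based interface are our own choices, not notation from the text.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, prod

def urn_probability(counts, draws):
    """Probability of drawing draws[i] balls of color i from an urn holding
    counts[i] balls of color i, when a subset of size sum(draws) is chosen
    uniformly at random."""
    favorable = prod(comb(n, k) for n, k in zip(counts, draws))
    total = comb(sum(counts), sum(draws))
    return Fraction(favorable, total)

# Small sanity check by direct enumeration: 3 red, 2 white, 1 blue ball.
balls = "RRRWWB"
subsets = list(combinations(range(len(balls)), 3))
one_of_each = [T for T in subsets if sorted(balls[i] for i in T) == ["B", "R", "W"]]
```

For the urn with 3 red, 2 white, and 1 blue ball, `urn_probability((3, 2, 1), (1, 1, 1))` gives 3/10, which matches the 6 favorable subsets out of 20 found by enumeration.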


1.69. Example: Lotteries. Consider the lottery described in 1.62. Here the sample space
S consists of all 6-element subsets of A = {1,2,...,49}, so $|S| = \binom{49}{6} = 13{,}983{,}816$.
Suppose a lottery player picks a (fixed and known) 6-element subset $T_0$ of A. For $0 \le k \le 6$,
define events $B_k = \{T \in S : |T \cap T_0| = k\}$. $B_k$ occurs when the player matches exactly k of
the winning numbers. We can build a typical object $T \in B_k$ by choosing k elements of $T_0$
in $\binom{6}{k}$ ways, and then choosing 6 − k elements of $A \sim T_0$ in $\binom{43}{6-k}$ ways. Hence,

$$P(B_k) = \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}}.$$

TABLE 1.2
Analysis of Virginia’s “Lotto South” lottery.

Matches | Probability                        | Prize Value
3       | 0.01765 or 1 in 57                 |
4       | 0.0009686 or 1 in 1032             | about $75
5       | 0.00001845 or 1 in 54,201          | about $1000
6       | 7.15 × 10^-8 or 1 in 13,983,816    | jackpot

Table 1.2 shows the probability of matching k numbers, for $3 \le k \le 6$. The table also shows
the amount of money one would win in the various cases. One can view this example as
the special case of the previous example where the urn contains 6 balls of one color and 43
balls of another color.
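The lottery probabilities can be recomputed directly from the formula for $P(B_k)$. In the sketch below, the function name `match_probability` and its keyword parameters `pool` and `picks` are hypothetical conveniences; the defaults correspond to the 49-ball, 6-pick lottery of this example.

```python
from fractions import Fraction
from math import comb

def match_probability(k, pool=49, picks=6):
    """Probability that a fixed ticket matches exactly k of the winning numbers."""
    favorable = comb(picks, k) * comb(pool - picks, picks - k)
    return Fraction(favorable, comb(pool, picks))

# Exact probabilities for k = 3, ..., 6, together with the "1 in N" odds.
table = {k: (float(match_probability(k)), round(1 / match_probability(k)))
         for k in range(3, 7)}
```

Using exact fractions avoids rounding until the final conversion to decimals; the events for k = 0, ..., 6 partition the sample space, so the seven probabilities sum to 1.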


In the lottery example, suppose we took the sample space to be the set S′ of all ordered
sequences of six distinct elements of {1,2,...,49}. Let $B_k'$ be the event that the player
guesses exactly k numbers correctly (disregarding order, as usual). Let P′ be the probability
measure on the sample space S′. One may check that $|S'| = \binom{49}{6} \cdot 6!$ and $|B_k'| = \binom{6}{k}\binom{43}{6-k} \cdot 6!$,
so that

$$P'(B_k') = \frac{\binom{6}{k}\binom{43}{6-k}\,6!}{\binom{49}{6}\,6!} = P(B_k).$$

This confirms our earlier remark that the two sample spaces S and S′ give the same
probabilities for events that do not depend on the order in which the balls are drawn.


1.70. Example: Lattice Paths. Suppose we randomly choose a lattice path from (0,0)
to (n,n). What is the probability that this path is a Dyck path? We know that there are
$\frac{1}{n+1}\binom{2n}{n,n}$ Dyck paths and $\binom{2n}{n,n}$ lattice paths ending at (n,n). Therefore, the probability
is 1/(n + 1). We discuss a remarkable generalization of this result, called the Chung-Feller
Theorem, in §12.2.


1.71. Example: General Probability Measures on a Finite Sample Space. We now
extend the previous discussion to the case where not all outcomes of the random experiment
are equally likely. Let S be a finite sample space and let $p : S \to [0,1]$ be a map such that
$\sum_{x \in S} p(x) = 1$. Intuitively, p(x) is the probability that the outcome x occurs. Now p is
not a probability measure, since its domain is S instead of $\mathcal{P}(S)$. We build a probability
measure from p by defining $P(A) = \sum_{x \in A} p(x)$. The axioms for a probability measure may
be routinely verified. A similar construction works in the case where S is a countably infinite
sample space. (Recall that a set S is countably infinite iff there exists a bijection $f : \mathbb{N} \to S$.)


1.72. Remark. In this section, we used counting techniques to solve basic probability 
questions. It is also possible to use probabilistic arguments to help solve counting problems. 
Examples of such arguments appear in §12.4 and §12.10. 




1.13 Games of Chance 


In this section, we use counting techniques to analyze two popular games of chance:
powerball lotteries and five-card poker.

TABLE 1.3
Analysis of the Powerball lottery.

Matches        | Probability                             | Prize Value
0 white, 1 red | 0.0145 or 1 in 69                       |
1 white, 1 red | 0.00788 or 1 in 127                     |
2 white, 1 red | 0.00134 or 1 in 745                     |
3 white, 0 red | 0.00344 or 1 in 291                     |
3 white, 1 red | 0.0000838 or 1 in 11,927                |
4 white, 0 red | 0.0000702 or 1 in 14,254                |
4 white, 1 red | 0.000001711 or 1 in 584,432             |
5 white, 0 red | 2.81 × 10^-7 or 1 in 3.56 million       | $200,000
5 white, 1 red | 6.844 × 10^-9 or 1 in 146 million       | jackpot


1.73. Example: Powerball. A powerball lottery has two kinds of balls: white balls
(numbered 1,...,M) and red balls (numbered 1,...,R). Each week, one red ball and a set of n
distinct white balls are randomly chosen. Lottery players guess what the n white balls will
be, and they also guess the red ball (called the “power ball”). Players win prizes based on
how many balls they guess correctly. Players always win a prize for matching the red ball,
even if they incorrectly guess all the white balls.

To analyze this lottery, let the sample space be

$$S = \{(T,x) : T \text{ is an } n\text{-element subset of } \{1,2,\ldots,M\} \text{ and } x \in \{1,2,\ldots,R\}\}.$$

Let $(T_0, x_0)$ be a fixed and known element of S representing a given player’s lottery ticket.
For $0 \le k \le n$, let $A_k$ be the event $\{(T,x) \in S : |T \cap T_0| = k,\ x \ne x_0\}$ in which the player
matches exactly k white balls but misses the power ball. Let $B_k$ be the event $\{(T,x) \in S :
|T \cap T_0| = k,\ x = x_0\}$ in which the player matches exactly k white balls and also matches
the power ball. We have $|S| = \binom{M}{n} R$ by the product rule. To build a typical element in $A_k$,
we first choose k elements of $T_0$, then choose n − k elements of $\{1,2,\ldots,M\} \sim T_0$, then
choose $x \in \{1,2,\ldots,R\} \sim \{x_0\}$. Thus, $|A_k| = \binom{n}{k}\binom{M-n}{n-k}(R-1)$, so

$$P(A_k) = \frac{\binom{n}{k}\binom{M-n}{n-k}(R-1)}{\binom{M}{n}R}.$$

Similarly,

$$P(B_k) = \frac{\binom{n}{k}\binom{M-n}{n-k}}{\binom{M}{n}R}.$$

In one version of this lottery, we have M = 55, R = 42, and n = 5. The probabilities of
certain events $A_k$ and $B_k$ are shown in Table 1.3 together with the associated prize amounts.
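The entries of Table 1.3 can be reproduced from the counts $|A_k|$ and $|B_k|$. The following sketch is an illustration under the stated parameters M = 55, R = 42, n = 5; the function name `powerball` and its boolean `red` argument are hypothetical.

```python
from fractions import Fraction
from math import comb

def powerball(k, red, M=55, R=42, n=5):
    """Probability of matching exactly k white balls; `red` tells whether
    the power ball is matched (event B_k) or missed (event A_k)."""
    whites = comb(n, k) * comb(M - n, n - k)   # choices for the white balls
    reds = 1 if red else R - 1                 # choices for the red ball
    return Fraction(whites * reds, comb(M, n) * R)

# "1 in N" odds for a few rows of the table.
odds = {(k, red): round(1 / powerball(k, red))
        for k, red in [(0, True), (3, False), (4, True), (5, True)]}
```

The events $A_k$ and $B_k$ for $0 \le k \le n$ partition S, so the twelve probabilities must add up to 1. The odds computed here (1 in 69, 1 in 291, 1 in 584,432, and 1 in 146,107,962) agree with the rounded figures in the table.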


Now we turn to an analysis of five-card poker. 


1.74. Definition: Cards. A suit is an element of the 4-element set Suits = {♣, ♦, ♥, ♠}. A
value is an element of the 13-element set Values = {2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A}, where
J, Q, K, and A stand for “jack,” “queen,” “king,” and “ace,” respectively. A card is an
element of the set Deck = Values × Suits.


Note that |Deck| = 13 · 4 = 52, by the product rule. For instance, (A, ♠) ∈ Deck. We
often abbreviate this notation to A♠, and similarly for other cards.


1.75. Definition: Poker Hands. A (five-card) poker hand is a 5-element subset of Deck.
Given such a hand H, let V(H) be the set of values that appear among the cards of H,
and let S(H) be the set of suits that appear among the cards of H. For each $x \in$ Values,
let $n_x(H)$ be the number of cards in H with value x.


1.76. Example. H = {AO, 3,30, J, Kde} is a five-card poker hand with V(H) = 
{A, 3, J, K}, S(H) = {%, d}, $n_3(H) = 2$, $n_A(H) = n_J(H) = n_K(H) = 1$, and $n_x(H) = 0$
for all $x \notin V(H)$.


We now study the sample space X consisting of all five-card poker hands. We know that
$|X| = \binom{52}{5} = 2{,}598{,}960$. In poker, certain hands in X play a special role. We define these
hands now.


1.77. Definition: Special Card Hands. Let H be a five-card poker hand.

• H is a four-of-a-kind hand iff there exists $x \in$ Values with $n_x(H) = 4$.

• H is a full house iff there exist $x, y \in$ Values with $n_x(H) = 3$ and $n_y(H) = 2$.

• H is a three-of-a-kind hand iff there exist $x, y, z \in$ Values with $y \ne z$, $n_x(H) = 3$, and
$n_y(H) = n_z(H) = 1$.

• H is a two-pair hand iff there exist $x, y, z \in$ Values with $y \ne z$, $n_x(H) = 1$, and
$n_y(H) = n_z(H) = 2$.

• H is a one-pair hand iff there exist distinct $w, x, y, z \in$ Values with $n_w(H) = 2$ and
$n_x(H) = n_y(H) = n_z(H) = 1$.

• H is a straight iff V(H) is one of the following sets:
{A, 2, 3, 4, 5} or {i, i+1, i+2, i+3, i+4} for some i with $2 \le i \le 6$
or {7, 8, 9, 10, J} or {8, 9, 10, J, Q} or {9, 10, J, Q, K} or {10, J, Q, K, A}.

• H is a flush iff |S(H)| = 1.

• H is a straight flush iff H is a straight and a flush.

• H is an ordinary hand iff H satisfies none of the above conditions.

1.78. Example: Card Hands.

© {5@, 8h, 5, 5&, 50} is a four-of-a-kind hand. 

© {Jh, 9h, J, Jd, 99} is a full house. 

© {Jh, 2h, J), J&, 99} is a three-of-a-kind hand. 

© {20, 9%, KO, 2&,90} is a two-pair hand. 

© {9d 100, 109, Ade, 4d} is a one-pair hand. 

© {7h&, 60,39, 5a, 4} is a straight that is not a flush. 


• {10♥, 3♥, Q♥, J♥, 8♥} is a flush that is not a straight.

TABLE 1.4
Probability of five-card poker hands.

Card Hand            | Number    | Probability
straight flush       | 40        | 1.54 × 10^-5
four-of-a-kind       | 624       | 0.00024
full house           | 3744      | 0.00144
flush (not straight) | 5108      | 0.001965
straight (not flush) | 10,200    | 0.00392
three-of-a-kind      | 54,912    | 0.02113
two pair             | 123,552   | 0.04754
one pair             | 1,098,240 | 0.42257
none of the above    | 1,302,540 | 0.50117


{10d, Ja, Q&, K&, Ad} is a straight flush. (A straight flush such as this one, which 
“starts at 10 and ends at A,” is called a royal flush. There are four royal flushes, one for 
each suit.) 


{9&%, 100,79, Ade, 4} is an ordinary hand. 


We now compute the probability of the various five-card poker hands. This amounts to
enumerating the hands of each type and dividing these counts by $|X| = \binom{52}{5} = 2{,}598{,}960$.
Our results are summarized in Table 1.4. In each case, the desired counting result will
follow from careful applications of the product rule. Less frequently occurring poker hands
are more valuable in the game. So, for instance, a flush beats a straight. A full house beats
both a straight and a flush separately, but is beaten by a straight flush.


Four-of-a-kind hands. To build a typical four-of-a-kind hand H, first choose the value
x that occurs 4 times in any of |Values| = 13 ways. All four cards of this value must
belong to H. Second, choose the fifth card of H in any of 52 — 4 = 48 ways. This gives 
13 x 48 = 624 four-of-a-kind hands. The sample hand above was constructed by choosing 
the value 5 followed by the card 8@. 


Full house hands. To build a typical full house H, first choose a value $x \in$ Values to
occur 3 times. This can be done in 13 ways. Second, choose 3 of the 4 cards of value
x to appear in the hand. This can be done in $\binom{4}{3} = 4$ ways. Third, choose a value
$y \in$ Values ~ {x} to occur twice in H. This can be done in 12 ways. Fourth, choose 2
of the 4 cards of value y to appear in the hand. This can be done in $\binom{4}{2} = 6$ ways. The
total is 13 · 4 · 12 · 6 = 3744 full house hands. The sample hand above was constructed
by choosing the value J, then the three cards {J@, J}, J}, then the value 9, then the 
two cards {9,99}. 


Three-of-a-kind hands. To build a typical three-of-a-kind hand H, first choose a value
$x \in$ Values to occur 3 times. This can be done in 13 ways. Second, choose 3 of the 4
cards of value x to appear in the hand. This can be done in $\binom{4}{3} = 4$ ways. Third, choose
a set of 2 values $\{y, z\} \subseteq$ Values ~ {x} that will occur once each in H. This can be done
in $\binom{12}{2} = 66$ ways. Let the notation be such that y < z (where 10 < J < Q < K < A).
Fourth, choose one of the 4 cards of value y to be in the hand in any of 4 ways. Fifth,
choose one of the 4 cards of value z to be in the hand in any of 4 ways. The total is
13 · 4 · 66 · 4 · 4 = 54,912. The sample hand above was constructed by choosing the value
J, then the three cards {J@, J), J&}, then the values {2,9}, then the card 2%, and
then the card 99.


Two-pair hands. To build a typical two-pair hand H, first choose a set of two values
$\{x, y\} \subseteq$ Values to occur twice each. This can be done in $\binom{13}{2} = 78$ ways. Let the
notation be such that x < y. Second, choose a set of two cards of value x in any of
$\binom{4}{2} = 6$ ways. Third, choose a set of two cards of value y in any of $\binom{4}{2} = 6$ ways. Fourth,
choose the last card in the hand. Since this card cannot have value x or y, the number
of possibilities here is 52 − 8 = 44. The total is 78 · 6 · 6 · 44 = 123,552. The sample hand
above was constructed by choosing the values {2,9}, then the cards {2@, 2&}, then the 
cards {9%, 90}, then the card Kd. 


One-pair hands. To build a typical one-pair hand H, first choose a value w to occur
twice in the hand. This can be done in 13 ways. Second, choose a set of two cards of
value w in any of $\binom{4}{2} = 6$ ways. Third, choose a set $\{x, y, z\} \subseteq$ Values ~ {w} (where
x < y < z) in any of $\binom{12}{3} = 220$ ways. Fourth, choose a card of value x in 4 ways. Fifth,
choose a card of value y in 4 ways. Sixth, choose a card of value z in 4 ways. The total
is 13 · 6 · 220 · 4 · 4 · 4 = 1,098,240. The sample hand above was constructed by choosing
the value w = 10, then the cards {10,1090}, then the values {4,9, A}, then the card 
4d&, then the card 9&, then the card Ad&. 


Straight hands. To build a typical straight H, first choose one of the ten allowable sets
V(H) in the definition of a straight. Then, for each of the five distinct values in V(H),
taken in increasing order, choose a suit for the card of that value. This can be done
in 4 ways for each value. The total is $10 \cdot 4^5 = 10{,}240$. The sample hand above was
constructed by choosing the value set V(H) = {3,4,5,6, 7}, then the suit 9 for the 3, 
then the suit & for the 4, then the suit & for the 5, then the suit > for the 6, and then 
the suit #& for the 7. In the table entry for straights, we subtract the number of straight 
flushes (namely 40, as shown below) so that the entries in the table will be pairwise 
disjoint subsets of X. 


Flush hands. To build a typical flush H, first choose the one-element set S(H) in any
of $\binom{4}{1} = 4$ ways. Then choose the five-element set V(H) in any of $\binom{13}{5}$ ways. H is now
completely determined since all cards in H have the same suit. The total is therefore
$4 \cdot \binom{13}{5} = 5148$. The sample hand above was constructed by choosing S(H) = {♥},
then V(H) = {3, 8, 10, J, Q}. In the table entry for flushes, we subtract the number of
straight flushes (namely 40, as shown below) so that the entries in the table will be
pairwise disjoint subsets of X.


Straight flushes. To build a typical straight flush H, first choose one of the ten allowable
sets V(H) in the definition of a straight. Then choose one of the four suits to be the
common suit of all cards in H. The total is 10 · 4 = 40. The sample hand above was
constructed by choosing V(H) = {10, J, Q, K, A} and then S(H) = {@}.


Ordinary hands. To count ordinary hands, one can subtract the total of the preceding
counts from |X|. However, the answer can also be obtained directly from the product
rule as follows. To build an ordinary hand H, first choose the value set V(H). We
must have |V(H)| = 5 to avoid hands such as two-pair, full house, etc. Also we must
avoid the ten special choices of V(H) in the definition of straight (all of which are
five-element sets). We conclude that V(H) can be chosen in $\binom{13}{5} - 10 = 1277$ ways. Write
$V(H) = \{v_1, v_2, v_3, v_4, v_5\}$, where $v_1 < v_2 < v_3 < v_4 < v_5$. For each $v_i$ in turn, choose
the suit for the card of that value in any of 4 ways. This would give $4^5$ choices, but we
must avoid the four choice sequences in which all $v_i$’s are assigned the same suit (which
would lead to a flush). So there are only $4^5 - 4 = 1020$ ways to assign suits to the
chosen values. The hand is now completely determined, so the total number of ordinary
hands is 1277 · 1020 = 1,302,540. The sample hand above was constructed by choosing
V(H) = {4,7,9,10, A}, then & as the suit for the 4, Y as the suit for the 7, & as the 
suit for the 9, > as the suit for the 10, and & as the suit for the ace. 
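A useful consistency check on the nine counts derived above is that, once straight flushes are removed from the straight and flush counts, the categories are pairwise disjoint and must add up to $|X| = \binom{52}{5}$. The sketch below simply transcribes the product-rule computations; the dictionary name `counts` is our own.

```python
from math import comb

TOTAL = comb(52, 5)  # number of five-card poker hands

counts = {
    "straight flush": 10 * 4,
    "four-of-a-kind": 13 * 48,
    "full house": 13 * comb(4, 3) * 12 * comb(4, 2),
    "flush (not straight)": 4 * comb(13, 5) - 40,
    "straight (not flush)": 10 * 4**5 - 40,
    "three-of-a-kind": 13 * comb(4, 3) * comb(12, 2) * 4 * 4,
    "two pair": comb(13, 2) * comb(4, 2) ** 2 * 44,
    "one pair": 13 * comb(4, 2) * comb(12, 3) * 4**3,
    "none of the above": (comb(13, 5) - 10) * (4**5 - 4),
}

# Probabilities in the classical model: count divided by the total.
probabilities = {hand: count / TOTAL for hand, count in counts.items()}
```

The counts sum to 2,598,960, which confirms that no hand has been counted twice or omitted.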




1.14 Conditional Probability and Independence 


Suppose that, in a certain random experiment, we are told that a particular event has 
occurred. Given this additional information, we can recompute the probability of other 
events occurring. This leads to the notion of conditional probability. 


1.79. Definition: Conditional Probability. Suppose A and B are events in some sample
space S such that P(B) > 0. The conditional probability of A given B, denoted P(A|B), is
defined by setting

$$P(A|B) = \frac{P(A \cap B)}{P(B)}.$$

In the case where S is a finite set of equally likely outcomes, we have $P(A|B) =
|A \cap B|/|B|$. This conditional probability need not have any relation to the unconditional
probability of A, which is P(A) = |A|/|S|.


1.80. Example: Dice Rolling. Consider the experiment of rolling a fair die twice. What
is the probability of getting a sum of 7 or 11, given that the second roll comes up 5? Here,
the sample space is $S = \{1,2,3,4,5,6\}^2$. Let A be the event of getting a sum of 7 or
11, and let B be the event that the second die shows 5. We have P(B) = 1/6, and we
saw earlier that P(A) = 2/9. Listing outcomes, we see that $A \cap B = \{(2,5), (6,5)\}$, so
$P(A \cap B) = 2/36 = 1/18$. Therefore, the required conditional probability is

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/18}{1/6} = 1/3 > 2/9 = P(A).$$

On the other hand, let C be the event that the second roll comes up 4. Here $A \cap C = \{(3,4)\}$,
so

$$P(A|C) = \frac{1/36}{6/36} = 1/6 < 2/9 = P(A).$$

Next, let D be the event that the first roll is an odd number. Then

$$A \cap D = \{(1,6), (3,4), (5,2), (5,6)\},$$

so

$$P(A|D) = \frac{4/36}{18/36} = 2/9 = P(A).$$

These examples show that the conditional probability of A given some other event can be
greater than, less than, or equal to the unconditional probability of A.
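In a finite space of equally likely outcomes, conditioning on an event amounts to restricting the enumeration to that event. The sketch below applies this to the two-dice experiment; the helper `P` with its optional `given` argument is a hypothetical convenience, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of rolls.
S = list(product(range(1, 7), repeat=2))

def P(event, given=lambda w: True):
    """Return |A ∩ B| / |B|, where A and B are described by predicates.
    With the default `given`, this is the unconditional probability |A| / |S|."""
    conditioned = [w for w in S if given(w)]
    return Fraction(sum(1 for w in conditioned if event(w)), len(conditioned))

def A(w):
    """The sum of the two rolls is 7 or 11."""
    return sum(w) in (7, 11)

p_A = P(A)
p_A_given_B = P(A, given=lambda w: w[1] == 5)      # second roll is 5
p_A_given_C = P(A, given=lambda w: w[1] == 4)      # second roll is 4
p_A_given_D = P(A, given=lambda w: w[0] % 2 == 1)  # first roll is odd
```

The four values are 2/9, 1/3, 1/6, and 2/9, illustrating the three possible relationships between a conditional probability and the unconditional one.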


1.81. Example: Balls in Urns. Suppose an urn contains r red balls and 6b blue balls, 
where r,b > 2. Consider an experiment in which two balls are drawn from the urn in 
succession, without replacement. What is the probability that the first ball is red, given 
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that the second ball is blue? We take the sample space to be the set S of all words w wo, 
where w, 4 wz and 
W1, We € {1,2,...,77r4+1,...,r +}. 


Here, the numbers 1 through r represent red balls and the numbers r + 1 through r + 6 
represent blue balls. The event of drawing a red ball first is the subset 


A= {wiwo:1< wi <r}. 
The event of drawing a blue ball second is the subset 


B={wyw2:rt1l<we<rt+d}. 


By the product rule, |S| = (r + 6)(r + 6-1), |A] = r(r + 6-1), |B] = b(r + b— 1), and 
|AN B| = rb. The conditional probability of A given B is 


P(A|B) = P(AN B)/P(B) =r/(r+b-1). 
In contrast, the unconditional probability of A is 
P(A) = |AJ/|S| = r/(r +0). 


The conditional probability is slightly higher than the unconditional probability; intuitively, 
we are more likely to have gotten a red ball first if we know the second ball was not red. 
The probability that the second ball is blue, given that the first ball is red, is 


P(B|A) = P(B ∩ A)/P(A) = b/(r + b - 1).

Note that P(B|A) ≠ P(A|B) (unless r = b).
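
The formulas of Example 1.81 can be checked for small urns by listing every ordered draw. This is a minimal sketch; the function name `urn_probabilities` and the sample values of r and b are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

def urn_probabilities(r, b):
    """Return (P(A|B), P(A), P(B|A)) for an urn with r red and b blue balls.

    Balls 1..r are red and balls r+1..r+b are blue, as in the example.
    """
    S = list(permutations(range(1, r + b + 1), 2))  # ordered draws, no replacement
    A = [w for w in S if w[0] <= r]                 # first ball is red
    B = [w for w in S if w[1] > r]                  # second ball is blue
    AB = [w for w in A if w[1] > r]                 # both conditions hold
    return (Fraction(len(AB), len(B)),
            Fraction(len(A), len(S)),
            Fraction(len(AB), len(A)))

for r, b in [(2, 2), (3, 5), (6, 2)]:
    p_a_given_b, p_a, p_b_given_a = urn_probabilities(r, b)
    assert p_a_given_b == Fraction(r, r + b - 1)
    assert p_a == Fraction(r, r + b)
    assert p_b_given_a == Fraction(b, r + b - 1)
    # The two conditional probabilities agree exactly when r = b.
    assert (p_a_given_b == p_b_given_a) == (r == b)
```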


1.82. Example: Card Hands. What is the probability that a 5-card poker hand is a
full house, given that the hand is void in clubs (i.e., no card in the hand is a club)? Let
A be the event of getting a full house, and let B be the event of being void in clubs. We
have |B| = $\binom{39}{5}$ = 575,757 since we must choose a five-element subset of the 52 - 13 = 39
non-club cards. Next, we must compute |A ∩ B|. To build a full house hand using no
clubs, make the following choices: first, choose a value to occur three times (13 ways);
second, choose the suits for this value (1 way, as clubs are forbidden); third, choose a
value to occur twice (12 ways); fourth, choose the suits for this value ($\binom{3}{2}$ = 3 ways). By
the product rule, |A ∩ B| = 13 · 1 · 12 · 3 = 468. Accordingly, the probability we want is
P(A|B) = 468/575,757 ≈ 0.000813.

Next, what is the probability of getting a full house, given that the hand has at least
two cards of the same value? Let C be the event that at least two cards in the hand have
the same value; we seek P(A|C) = P(A ∩ C)/P(C) = |A ∩ C|/|C|. The numerator here can
be computed quickly: since A ⊆ C, we have A ∩ C = A and hence |A ∩ C| = |A| = 3744 (see
Table 1.4). To compute the denominator, let us first enumerate X ~ C, where X is the full
sample space of all five-card poker hands. Note that X ~ C occurs iff all five cards in the
hand have different values. Choose these values ($\binom{13}{5}$ ways), and then choose suits for each
card (4 ways each). By the product rule, |X ~ C| = $\binom{13}{5} \cdot 4^5$ = 1,317,888. So

|C| = |X| - |X ~ C| = 1,281,072.

The desired conditional probability is

$$P(A|C) = \frac{3744}{1{,}281{,}072} \approx 0.00292.$$
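
The counts in Example 1.82 are products of binomial coefficients, so they are easy to recompute. The sketch below uses Python's `math.comb`; the variable names are chosen here for illustration only.

```python
from math import comb

# Hands void in clubs: any 5 of the 39 non-club cards.
void_in_clubs = comb(39, 5)

# Full houses with no clubs: value of the triple, its suits (forced),
# value of the pair, then 2 of the 3 non-club suits for the pair.
full_house_no_clubs = 13 * 1 * 12 * comb(3, 2)

# All full houses: 3 of 4 suits for the triple, 2 of 4 suits for the pair.
full_houses = 13 * comb(4, 3) * 12 * comb(4, 2)

# Hands with at least two cards of the same value, via the difference rule.
all_hands = comb(52, 5)
all_values_distinct = comb(13, 5) * 4**5
some_repeated_value = all_hands - all_values_distinct

print(full_house_no_clubs / void_in_clubs)  # P(A|B), about 0.000813
print(full_houses / some_repeated_value)    # P(A|C), about 0.00292
```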
In some situations, the knowledge that a particular event D occurs does not change
the probability that another event A will occur. For instance, events D and A in the dice
rolling example 1.80 have this property because P(A|D) = P(A). Writing out the definition
of P(A|D) and multiplying by P(D), we see that the stated property is equivalent to
P(A ∩ D) = P(A)P(D) (assuming P(D) > 0). This suggests the following definition, which
is valid even when P(D) = 0.


1.83. Definition: Independence of Two Events. Two events A and D are called
independent iff
P(A ∩ D) = P(A)P(D).

Unlike the definition of conditional probability, this definition is symmetric in A and
D. So, A and D are independent iff D and A are independent. As indicated above, when
P(D) > 0, independence of A and D is equivalent to P(A|D) = P(A). Similarly, when
P(A) > 0, independence of A and D is equivalent to P(D|A) = P(D). So, when considering
two independent events of positive probability, knowledge that either event has occurred
gives us no new information about the probability of the other event occurring.


1.84. Definition: Independence of a Collection of Events. Suppose A_1, ..., A_n are
events. This list of events is called independent iff for all choices of indices
1 ≤ i_1 < i_2 < ··· < i_k ≤ n,

$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdot P(A_{i_2}) \cdot \ldots \cdot P(A_{i_k}).$$


1.85. Example. Let S = {a, b, c, d}, and suppose each outcome in S occurs with probability
1/4. Define events B = {a, b}, C = {a, c}, and D = {a, d}. One verifies immediately that B
and C are independent; B and D are independent; and C and D are independent. However,
the triple of events B, C, D is not independent, because

P(B ∩ C ∩ D) = P({a}) = 1/4 ≠ 1/8 = P(B)P(C)P(D).
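
Definition 1.84 requires the product condition for every sub-list of events, which is what fails in Example 1.85. The following sketch tests that condition directly; the function names `P` and `is_independent` are assumptions introduced for this illustration.

```python
from fractions import Fraction
from itertools import combinations

S = {"a", "b", "c", "d"}  # four equally likely outcomes

def P(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))

def is_independent(events):
    """Check the product condition of 1.84 for every sub-list of size >= 2."""
    for k in range(2, len(events) + 1):
        for sub in combinations(events, k):
            intersection = set(S).intersection(*sub)
            product_of_probs = Fraction(1)
            for e in sub:
                product_of_probs *= P(e)
            if P(intersection) != product_of_probs:
                return False
    return True

B, C, D = {"a", "b"}, {"a", "c"}, {"a", "d"}
print(is_independent([B, C]), is_independent([B, D]), is_independent([C, D]))
# prints: True True True
print(is_independent([B, C, D]))
# prints: False
```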


1.86. Example: Coin Tossing. Suppose we toss a fair coin 5 times. Take the sample
space to be S = {H,T}^5. Let A be the event that the first and last toss agree; let B be
the event that the third toss is tails; let C be the event that there are an odd number of
heads. Routine counting arguments show that |S| = 2^5 = 32, |A| = 2^4 = 16, |B| = 2^4 = 16,
|C| = $\binom{5}{1} + \binom{5}{3} + \binom{5}{5}$ = 16, |A ∩ B| = 2^3 = 8, |A ∩ C| = $2\left(\binom{3}{1} + \binom{3}{3}\right)$ = 8,
|B ∩ C| = $\binom{4}{1} + \binom{4}{3}$ = 8, and |A ∩ B ∩ C| = 4. It follows that

P(A ∩ B) = P(A)P(B); P(A ∩ C) = P(A)P(C); P(B ∩ C) = P(B)P(C);
P(A ∩ B ∩ C) = P(A)P(B)P(C).
Thus, the triple of events (A, B,C) is independent. 
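
The cardinalities quoted in Example 1.86 can be confirmed by listing all 32 outcomes. This is a small illustrative sketch; the set names mirror the events in the example, and `P` is a helper defined here.

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=5))                    # all 32 toss sequences
A = {w for w in S if w[0] == w[-1]}                  # first and last toss agree
B = {w for w in S if w[2] == "T"}                    # third toss is tails
C = {w for w in S if w.count("H") % 2 == 1}          # odd number of heads

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

print(len(A), len(B), len(C), len(A & B), len(A & C), len(B & C), len(A & B & C))
# prints: 16 16 16 8 8 8 4

# The four product conditions required by Definition 1.84.
assert P(A & B) == P(A) * P(B)
assert P(A & C) == P(A) * P(C)
assert P(B & C) == P(B) * P(C)
assert P(A & B & C) == P(A) * P(B) * P(C)
```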


We often assume that unrelated physical events are independent (in the mathematical 
sense) to help us construct a probability model. The next example illustrates this process. 


1.87. Example: Tossing an Unfair Coin. Consider a random experiment in which we
toss an unbalanced coin n times in a row. Suppose that the coin comes up heads with
probability q and tails with probability 1 - q, and that successive coin tosses are unrelated
to one another. Let the sample space be S = {H,T}^n. Since the coin is unfair, it is not
appropriate to assume that every point of S occurs with equal probability. Given an outcome
w = w_1 w_2 ··· w_n ∈ S, what should the probability p(w) be? Consider an example where
n = 5 and w = HHTHT. Consider the five events B_1 = {z ∈ S : z_1 = H}, B_2 = {z ∈ S :
z_2 = H}, B_3 = {z ∈ S : z_3 = T}, B_4 = {z ∈ S : z_4 = H}, and B_5 = {z ∈ S : z_5 = T}. Our
physical assumptions suggest that B_1, ..., B_5 should be independent events (since different
tosses of the coin are unrelated), P(B_1) = P(B_2) = P(B_4) = q, and P(B_3) = P(B_5) = 1 - q.
Since B_1 ∩ B_2 ∩ B_3 ∩ B_4 ∩ B_5 = {w}, the definition of independence leads to

$$p(w) = P(B_1 \cap \cdots \cap B_5) = P(B_1)P(B_2) \cdots P(B_5) = q \cdot q(1-q)q(1-q) = q^3(1-q)^2.$$

Similar reasoning shows that if w = w_1 w_2 ··· w_n ∈ S is an outcome consisting of k heads and
n - k tails (arranged in one particular order), then we should define p(w) = q^k (1 - q)^{n-k}.
Next, define P(A) = $\sum_{w \in A} p(w)$ for every event A ⊆ S. For example, let A_k be the event
that we get k heads and n - k tails (in any order). Note that |A_k| = |R(H^k T^{n-k})| = $\binom{n}{k}$,
and p(w) = q^k (1 - q)^{n-k} for each w ∈ A_k. It follows that

$$P(A_k) = \binom{n}{k} q^k (1-q)^{n-k}.$$

We have not yet checked that P(S) = 1, which is one of the requirements in the definition
of a probability measure. This fact can be deduced from the binomial theorem (discussed
in §2.2), as follows. Since S is the disjoint union of A_0, A_1, ..., A_n, we have

$$P(S) = \sum_{k=0}^{n} \binom{n}{k} q^k (1-q)^{n-k}.$$

By the binomial theorem 2.14, the right side is (q + [1 - q])^n = 1^n = 1.
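
The construction in Example 1.87 can be tested numerically for a small n. The sketch below assumes the illustrative values n = 5 and q = 1/3 (neither is fixed by the text) and compares a brute-force sum of p(w) with the closed formula for P(A_k).

```python
from fractions import Fraction
from itertools import product
from math import comb

def p(w, q):
    """Probability of a single outcome w (a sequence of 'H'/'T') when P(heads) = q."""
    k = sum(1 for toss in w if toss == "H")
    return q**k * (1 - q)**(len(w) - k)

def prob_k_heads(n, k, q):
    """P(A_k), obtained by summing p(w) over all outcomes with exactly k heads."""
    return sum(p(w, q) for w in product("HT", repeat=n) if w.count("H") == k)

n, q = 5, Fraction(1, 3)  # assumed values, chosen only for this illustration
distribution = [prob_k_heads(n, k, q) for k in range(n + 1)]

# Each brute-force value agrees with C(n, k) q^k (1 - q)^(n - k).
for k in range(n + 1):
    assert distribution[k] == comb(n, k) * q**k * (1 - q)**(n - k)

total = sum(distribution)
print(p("HHTHT", q), total)
# prints: 4/243 1
```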


Summary 


We end each chapter by summarizing some of the main definitions and results discussed in 
the chapter. 


• Notation. Factorials: 0! = 1 and n! = n × (n - 1) × ··· × 1.
Binomial coefficients: $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ for 0 ≤ k ≤ n; $\binom{n}{k}$ = 0 otherwise.
Multinomial coefficients: Given n_1, ..., n_k ≥ 0 and N = n_1 + ··· + n_k,
$\binom{N}{n_1, \ldots, n_k} = \frac{N!}{n_1! \, n_2! \cdots n_k!}$.
Rearrangements: R(a_1^{n_1} ··· a_k^{n_k}) is the set of all words consisting of n_i copies of a_i for each i.


• Basic Counting Rules. Sum Rule: If A_1, ..., A_k are pairwise disjoint finite sets, then
|A_1 ∪ ··· ∪ A_k| = |A_1| + ··· + |A_k|.
Union Rule: If A and B are arbitrary finite sets, then |A ∪ B| = |A| + |B| - |A ∩ B|.
Difference Rule: If A ⊆ B and B is finite, then |B ~ A| = |B| - |A|.
Product Rule: If A_1, ..., A_k are arbitrary finite sets, then |A_1 × ··· × A_k| = |A_1| · ... · |A_k|.
Bijection Rule: If there is a bijection f : A → B, then |A| = |B|.


• Counting Words. Let A be an n-letter alphabet.
There are n^k words of length k using letters from A.
If the letters must be distinct, there are n!/(n - k)! words of length k ≤ n.
There are n! permutations of all the letters in A.
There are $\binom{n_1 + \cdots + n_k}{n_1, \ldots, n_k}$ words in R(a_1^{n_1} ··· a_k^{n_k}).

• Counting Sets and Multisets.
The number of k-element subsets of an n-element set is the binomial coefficient $\binom{n}{k}$.
The total number of subsets of an n-element set is 2^n.
The number of k-element multisets using n available objects is $\binom{k+n-1}{k, n-1}$.

• Counting Functions. Let |X| = a and |Y| = b.
There are b^a functions mapping X into Y.
For a ≤ b, there are b!/(b - a)! injections from X to Y.
If a = b, there are a! bijections from X onto Y.

• Counting Lattice Paths. There are $\binom{a+b}{a, b}$ lattice paths from (0,0) to (a, b).
There are $\binom{n_1 + \cdots + n_d}{n_1, \ldots, n_d}$ lattice paths in R^d from the origin to (n_1, n_2, ..., n_d).
The number of paths from (0,0) to (n,n) that never go below y = x is the Catalan
number

$$C_n = \frac{1}{n+1}\binom{2n}{n} = \frac{1}{2n+1}\binom{2n+1}{n} = \binom{2n}{n, n} - \binom{2n}{n+1, n-1}.$$

This can be proved using a reflection bijection to convert paths ending at (n,n) that do
go below y = x to arbitrary paths from (0,0) to (n+1, n-1).


• Compositions. A composition of n is an ordered sequence (a_1, ..., a_k) of positive integers
that sum to n. There are 2^{n-1} compositions of n. There are $\binom{n-1}{k-1}$ compositions of n
with k parts.

• Probability Definitions. A sample space is the set S of outcomes for some random
experiment. An event is a subset of the sample space. When all outcomes in S are equally
likely, the probability of an event A is P(A) = |A|/|S|. The conditional probability of A
given B is P(A|B) = P(A ∩ B)/P(B), when P(B) > 0. Events A and B are independent
iff P(A ∩ B) = P(A)P(B).
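
Several of the counting formulas summarized above can be spot-checked by brute force for small parameters. The following Python sketch is one way to do so; the helper names `compositions` and `count_dyck_paths`, and the test values n = 5, k = 3, are assumptions made for this illustration.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, factorial

n, k = 5, 3
alphabet = range(n)

# Words, words with distinct letters, subsets, and multisets of size k.
assert sum(1 for _ in product(alphabet, repeat=k)) == n**k
assert sum(1 for _ in permutations(alphabet, k)) == factorial(n) // factorial(n - k)
assert sum(1 for _ in combinations(alphabet, k)) == comb(n, k)
assert sum(1 for _ in combinations_with_replacement(alphabet, k)) == comb(k + n - 1, k)

def compositions(n):
    """Yield every composition of n as a tuple of positive integers."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def count_dyck_paths(n):
    """Count N/E lattice paths from (0,0) to (n,n) that never go below y = x."""
    count = 0
    for north_positions in combinations(range(2 * n), n):
        north = set(north_positions)
        height = 0  # (number of N steps) minus (number of E steps) so far
        stays_above = True
        for step in range(2 * n):
            height += 1 if step in north else -1
            if height < 0:
                stays_above = False
                break
        count += stays_above
    return count

all_comps = list(compositions(6))
assert len(all_comps) == 2**5
for parts in range(1, 7):
    assert sum(1 for c in all_comps if len(c) == parts) == comb(5, parts - 1)

for m in range(1, 7):
    assert count_dyck_paths(m) == comb(2 * m, m) // (m + 1)
    assert count_dyck_paths(m) == comb(2 * m, m) - comb(2 * m, m + 1)
```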


Exercises 


1.88. (a) How many numbers between 1 and 1000 are divisible by 5 or 7? (b) How many 
such numbers are divisible by 5 or 7, but not both? 


1.89. How many three-digit numbers: (a) do not contain the digits 5 or 7; (b) contain the 
digits 5 and 7; (c) contain the digits 5 or 7; (d) contain 5 or 7, but not both? 


1.90. How many seven-digit phone numbers do not begin with one of the prefixes 1, 911, 
411, or 555? 


1.91. How many n-letter words over the alphabet {0,1} use both the symbols 0 and 1? 


1.92. (a) How many four-letter words w using an n-letter alphabet satisfy w_i ≠ w_{i+1} for
i = 1, 2, 3? (b) How many of the words in (a) also satisfy w_4 ≠ w_1?


1.93. A key for the DES encryption system is a binary word of length 56. A key for a 
permutation cipher is a permutation of the 26-letter English alphabet. Which encryption 
system has more keys? 


1.94. A key for the AES encryption system is a binary word of length 128. Suppose we try 
to decrypt an AES message by exhaustively trying every possible key. Assume six billion 
computers are running in parallel, where each computer can test one trillion keys per second. 
Estimate the number of years required for this attack to search the entire space of keys. 
1.95. A pizza shop offers ten toppings. How many pizzas can be ordered with: (a) three
different toppings; (b) up to three different toppings; (c) three toppings, with repeats allowed;
(d) four different toppings, but pepperoni and sausage cannot be ordered together?


1.96. How many lattice paths from (0,0) to (7,5) pass through the point (2,3)? 


1.97. How many n-letter words contain: (a) only vowels; (b) no vowels; (c) at least one 
vowel; (d) alternating vowels and consonants; (e) two vowels and n — 2 consonants? (The 
vowels are A, E, I, O, and U.) 


1.98. How many four-digit even numbers contain the digit 5 but not the digit 2? 


1.99. A palindrome is a word w = w_1 w_2 ··· w_k that reads the same in reverse, i.e.,
w_1 w_2 ··· w_k = w_k ··· w_2 w_1. Count the number of k-letter palindromes using letters from
an n-letter alphabet.


1.100. Explicitly list the following objects: (a) all 4-letter words using the alphabet {0, 1}; 
(b) all permutations of {a, b,c, d}; (c) all 2-permutations of {u, v,w, x,y}; (d) all words in 
Ri(a*y?z"). 


1.101. Explicitly list the following objects: (a) all bijections from {1, 2, 3} to {i, j, k}; (b)
all surjections from {1, 2, 3} to {0, 1}; (c) all injections from {a, b} to {c, d, e, f}.


1.102. Explicitly list the following objects: (a) all subsets of {0, 1,2}; (b) all three-element 
subsets of {1, 2,3,4,5}; (c) all three-element multisets using the alphabet {a,b,c}. 


1.103. Explicitly list the following objects: (a) all compositions of 4; (b) all compositions 
of 7 with exactly three parts; (c) all lattice paths from (0,0) to (4,2); (d) all Dyck paths of 
order 4. 


1.104. Draw pictures of all compositions of 5. For each composition, determine the
associated word in {0,1}^4 constructed in the proof of 1.41.


1.105. How many lattice paths start at (0,0) and end on the line x + y =n? 
1.106. Let r be the bijection in the proof of 1.56. Compute 


r(NNEEEENNNNEEEENN) and r^{-1}(NENEENNEEENEEENN).


1.107. Draw all the non-Dyck lattice paths from (0,0) to (3,3) and compute their images 
under the reflection map r from the proof of 1.56. 


1.108. A bit is one of the symbols 0 or 1. Find the minimum k such that every printable 
character on a standard computer keyboard can be encoded by a distinct bit string of length 
exactly k. Does the answer change if we allow nonempty bit strings of length at most k? 


1.109. Ten lollipops are to be distributed to four children. All lollipops of the same color 
are considered identical. How many distributions are possible if (a) all lollipops are red; (b) 
all lollipops have different colors; (c) there are four red and six blue lollipops? (d) What are 
the answers if each child must receive at least one lollipop? 


1.110. Given a positive integer n, let the prime factorization of n be n = p_1^{e_1} p_2^{e_2} ··· p_k^{e_k},
where each e_i > 0 and the p_i are distinct primes. How many positive divisors does n have?
How many divisors does n have in Z? 
1.111. (a) Given k and N, count the number of weakly increasing sequences (i_1 ≤ i_2 ≤
··· ≤ i_k) with 1 ≤ i_j ≤ N for all j. (b) Count the number of strictly decreasing sequences
(i_1 > i_2 > ··· > i_k) with 1 ≤ i_j ≤ N for all j. (c) For a fixed choice of k, count the number
of permutations w of N objects such that

$$w_1 < w_2 < \cdots < w_k > w_{k+1} < w_{k+2} < \cdots < w_N. \tag{1.2}$$

(d) How many permutations satisfy (1.2) for some k < N?


1.112. Euler’s φ Function. For each n ≥ 1, let Φ(n) be the set of integers k between
1 and n such that gcd(k, n) = 1, and let φ(n) = |Φ(n)|. (a) Compute Φ(n) and φ(n) for
1 ≤ n ≤ 12. (b) Compute φ(p) for p prime. (c) Compute φ(p^e) for p prime and e > 1.
(Exercise 1.150 shows how to compute φ(n) for any n.)


1.113. (a) How many 4-element subsets of {1, 2, ..., 11} contain no two consecutive
integers? (b) Given d, k, n, how many k-element subsets S of {1, 2, ..., n} are such that any
two distinct elements of S differ by at least d?


1.114. (a) How many anagrams of ‘MISSISSIPPI’ are there? (b) How many of these
anagrams begin and end with P? (c) In how many of these anagrams are the two P’s adjacent?
(d) In how many of these anagrams are no two I’s adjacent?


1.115. A two-to-one function is a function f : X → Y such that for every y ∈ Y, there exist
exactly two elements x_1, x_2 ∈ X with f(x_1) = y = f(x_2). How many two-to-one functions
are there from a 2n-element set to an n-element set?


1.116. A monomial in N variables is a term of the form x_1^{k_1} x_2^{k_2} ··· x_N^{k_N}, where each k_i ≥ 0.
The degree of this monomial is k_1 + k_2 + ··· + k_N. How many monomials in N variables
have degree (a) exactly d; (b) at most d?


1.117. How many multisets (of any size) can be formed from an n-letter alphabet if each 
letter can appear at most k times in the multiset? 


1.118. Two fair dice are rolled. Find the probability that: (a) the same number appears 
on both dice; (b) the sum of the numbers rolled is 8; (c) the sum of the numbers rolled is 
divisible by 3; (d) the two numbers rolled differ by 1. 


1.119. In blackjack, you have been dealt two cards from a shuffled 52-card deck: 90 and 
6. Find the probability that drawing one more card will cause the sum of the three card 
values to go over 21. (Here, an ace counts as 1 and other face cards count as 10.) 


1.120. Find the probability that a random 5-letter word: (a) has no repeated letters; (b) 
contains no vowels; (c) is a palindrome. 


1.121. A company employs ten men (one of whom is Bob) and eight women (one of whom
is Alice). A four-person committee is randomly chosen. Find the probability that the
committee: (a) consists of all men; (b) consists of two men and two women; (c) does not have
both Alice and Bob as members.


1.122. A fair coin is tossed ten times. (a) Find the probability of getting exactly seven 
heads. (b) Find the probability of getting at least two heads. (c) Find the probability of 
getting exactly seven heads, given that the number of heads was prime. 


1.123. A fair die is tossed ten times. What is the probability that, in these ten tosses, 1 
comes up 5 times, 3 comes up 2 times, and 6 comes up 3 times? 
1.124. Ten balls are drawn (without replacement) from an urn containing 40 red, 30 blue,
and 30 white balls. (a) What is the probability that no blue balls are drawn? (b) What is 
the probability of getting 4 red, 3 blue, and 3 white balls? (c) What is the probability that 
all ten balls have the same color? (d) Answer the same questions assuming the balls are 
drawn with replacement. 


1.125. Urn A contains two red balls and three black balls. Urn B contains one red ball and 
four black balls. Urn C contains four red balls and one black ball. A ball is randomly chosen 
from each of the three urns. Find the probability that all three balls are the same color. 


1.126. Consider the three urns from 1.125 (with five balls in each urn). An urn is selected 
at random, and then one ball is selected from that urn. What is the probability that: (a) 
the ball is black, given that urn B was chosen; (b) the ball is black; (c) urn B was chosen, 
given that the ball was black? 


1.127. A fair coin is tossed three times. (a) Describe the sample space. (b) Consider the 
following events. A: second toss is tails; B: first and last tosses agree; C: all tosses are the 
same; D: the number of heads is odd. Describe each event as a subset of the sample space. 
(c) Which pairs of events from {A, B,C, D} are independent? (d) Is the triple of events 
A, B, D independent? Explain. 


1.128. Let the prime factorization of n! be p_1^{e_1} p_2^{e_2} ··· p_k^{e_k}. Prove that $e_i = \sum_{j=1}^{\infty} \lfloor n/p_i^{j} \rfloor$.
(The notation ⌊x⌋ denotes the greatest integer not exceeding the real number x.) Hence
determine the number of trailing zeroes in the decimal notation for 100!.


1.129. Find a bijection on Comp_n that maps compositions with k parts to compositions
with n + 1 - k parts for all k.


1.130. (a) How many numbers between one and one million contain the digit 7? (b) If one 
writes down the numbers from one to one million, how often will one write the digit 7? (c) 
What are the answers to (a) and (b) if 7 is replaced by 0? 


1.131. A relation from X to Y is any subset of X x Y. Suppose X has n elements and Y 
has k elements. (a) How many relations from X to Y are there? (b) How many relations 
R satisfy the following property: for each y € Y, there exists at most one x € X with 
(x,y) € R? 


1.132. Suppose we play five-card poker using a 51-card deck in which the queen of spades 
has been removed. Compute the probabilities of the poker hands in Table 1.4 relative to 
this deck. 


1.133. Suppose we play five-card poker using two identical decks mixed together.
Compute the probabilities of the poker hands in Table 1.4 in this situation. Also compute the
probability of a “five-of-a-kind” hand, which is a poker hand H such that |V(H)| = 1. 


1.134. Consider a five-card poker hand dealt from a 52-card deck. (a) What is the
probability that the hand contains only red cards (i.e., hearts and diamonds)? (b) What is the
probability that the hand contains exactly two eights? (c) What is the probability that the 
hand contains only numerical cards (i.e., ace, jack, queen, and king may not appear)? 


1.135. Consider a five-card poker hand dealt from a 52-card deck. (a) What is the
probability that the hand is a flush, given that the hand contains no clubs? (b) What is the
probability that the hand contains at least one card from each of the four suits? (c) What 
is the probability of getting a two-pair hand, given that at least two cards in the hand have 
the same value? 
1.136. Let K be the event that a five-card poker hand contains the card K♡. Find the
conditional probability of each event in Table 1.4, given K. Which of these events are
independent of K?


1.137. Texas Hold ’em. In a popular version of poker, a player is dealt an ordered 
sequence of seven distinct cards from a 52-card deck. We model this situation using the 
sample space 

S = {(C_1, C_2, ..., C_7) : C_i ∈ Deck, C_i ≠ C_j for i ≠ j}.


(The last five cards in this sequence are “community cards” shared with other players. In 
this exercise we concentrate on a single player, so we ignore this aspect of the game.) The 
player uses these seven cards to form the best possible five-card poker hand (cf. Table 1.4). 
For example, if we were dealt the hand 


(49, 7d, 30, 9de, Sie, Coe, de), 


we would have a flush (the five club cards) since this beats the straight (3,4,5,6,7 of various 
suits). (a) Compute |S|. (b) What is the probability of getting 4-of-a-kind? (c) What is
the probability of getting a flush? (d) What is the probability of getting 4-of-a-kind, given 
C, = 39 and Cp = 3@? (e) What is the probability of getting a flush, given C, = 5 and 
Cy = 90? 


1.138. Prove that the following conditions are equivalent for any sets A and B: (a) A ⊆ B;
(b) A ∩ B = A; (c) A ∪ B = B; (d) A ~ B = ∅.

1.139. Prove that if A and B are unequal nonempty sets, then A × B ≠ B × A.

1.140. Use the binary union rule 1.4 to prove that for all finite sets X, Y, Z,
|X ∪ Y ∪ Z| = |X| + |Y| + |Z| - |X ∩ Y| - |X ∩ Z| - |Y ∩ Z| + |X ∩ Y ∩ Z|.


1.141. (a) For fixed k, prove that $\lim_{n \to \infty} \frac{n!}{(n-k)! \, n^k} = 1$. (b) Give a probabilistic
interpretation of this result.


1.142. Let f : P(X) → {0,1}^n be the bijection in 1.38. Given two words v, w ∈ {0,1}^n,
define words v ∧ w, v ∨ w, and ¬v by setting (v ∧ w)_i = min(v_i, w_i), (v ∨ w)_i = max(v_i, w_i),
and (¬v)_i = 1 - v_i for all i ≤ n. Prove that for all S, T ⊆ X, f(S ∩ T) = f(S) ∧ f(T),
f(S ∪ T) = f(S) ∨ f(T), f(X ~ S) = ¬f(S), f(∅) = 00···0, and f(X) = 11···1.


1.143. Let A, B, C be events in a probability space S. Assume A and C are independent,
and B and C are independent. (a) Give an example where A ∪ B and C are not independent.
(b) Prove that A ∪ B and C are independent if A and B are disjoint. (c) Must A ∩ B and
C be independent? Explain.


1.144. Properties of Injections. Prove the following statements about injective
functions. (a) If f : X → Y and g : Y → Z are injective, then g ∘ f is injective. (b) If g ∘ f is
injective, then f is injective but g may not be. (c) f : X → Y is injective iff for all W and
all g, h : W → X, f ∘ g = f ∘ h implies g = h.


1.145. Properties of Surjections. Prove the following statements about surjective
functions. (a) If f : X → Y and g : Y → Z are surjective, then g ∘ f is surjective. (b) If g ∘ f is
surjective, then g is surjective but f may not be. (c) f : X → Y is surjective iff for all Z
and all g, h : Y → Z, g ∘ f = h ∘ f implies g = h.
1.146. Sorting by Comparisons. Consider a game in which player 1 picks a permutation
w of n letters, and player 2 must determine w by asking player 1 a sequence of yes/no
questions. (Player 2 can choose later questions in the sequence based on the answers to
earlier questions.) Let K(n) be the minimum number such that, no matter what w player
1 chooses, player 2 can correctly identify w after at most K(n) questions. (a) Prove that
(n/2) log(n/2) ≤ ⌈log_2(n!)⌉ ≤ K(n). (b) Prove that K(n) = ⌈log_2(n!)⌉ for n ≤ 5. (c) Prove
that (b) still holds if we restrict player 2 to ask only questions of the form “is w_i < w_j?”
at each stage. (d) What does (a) imply about the length of time needed to sort n distinct
elements using an algorithm that makes decisions by comparing two data elements at a
time?


1.147. (a) You are given twelve seemingly identical coins and a balance scale. One coin is 
counterfeit and is either lighter or heavier than the others. Describe a strategy that can be 
used to identify which coin is fake in only three weighings. (b) If there are thirteen coins, 
can the fake coin always be found in three weighings? Justify your answer. (c) If there are 
N coins (one of which is fake), derive a lower bound for the number of weighings required 
to find the fake coin. 


1.148. Define f : N × N → N^+ by f(a, b) = 2^a (2b + 1). Prove that f is a bijection.

1.149. Define f : N × N → N by f(a, b) = ((a + b)^2 + 3a + b)/2. Prove that f is a bijection.


1.150. Chinese Remainder Theorem. In this exercise, we write “a mod k” to denote
the unique integer b in the range {1, 2, ..., k} such that k divides (a - b). Suppose m and
n are fixed positive integers. Define a map

f : {1, 2, ..., mn} → {1, 2, ..., m} × {1, 2, ..., n} by setting f(z) = (z mod m, z mod n).

(a) Show that f(z) = f(w) iff lcm(m, n) divides z - w. (b) Show that f is injective iff
gcd(m, n) = 1. (c) Deduce that f is a bijection iff gcd(m, n) = 1. (d) Prove that for
gcd(m, n) = 1, f maps Φ(mn) bijectively onto Φ(m) × Φ(n), and hence φ(mn) = φ(m)φ(n).
(See 1.112 for the definition of Φ and φ.) (e) Suppose n has prime factorization p_1^{e_1} ··· p_k^{e_k}.
Prove that $\phi(n) = n \prod_{i=1}^{k} (1 - 1/p_i)$.


1.151. Bijective Product Rule. For any positive integers m, n, define
g : {0, 1, ..., m-1} × {0, 1, ..., n-1} → {0, 1, ..., mn-1}
by setting g(i, j) = ni + j. Carefully prove that g is a bijection.


1.152. Bijective Laws of Algebra. (a) For all sets X, Y, Z, prove that X ∪ Y = Y ∪ X,
(X ∪ Y) ∪ Z = X ∪ (Y ∪ Z), and X ∪ ∅ = X = ∅ ∪ X. (b) For all sets X, Y, Z, define
bijections f : X × Y → Y × X, g : (X × Y) × Z → X × (Y × Z), and (for Y, Z disjoint)
h : X × (Y ∪ Z) → (X × Y) ∪ (X × Z). (c) Use (a), (b), and counting rules to deduce the
algebraic laws x + y = y + x, (x + y) + z = x + (y + z), x + 0 = x = 0 + x, xy = yx,
(xy)z = x(yz), and x(y + z) = xy + xz, valid for all integers x, y, z ≥ 0.


1.153. Bijective Laws of Exponents. (a) If X, Y, Z are sets with Y ∩ Z = ∅, define a
bijection from ${}^{Y \cup Z}X$ to ${}^{Y}X \times {}^{Z}X$. (b) If X, Y, Z are any sets, define a bijection from ${}^{Z}({}^{Y}X)$
to ${}^{Y \times Z}X$. (c) By specializing to finite sets, deduce the laws of exponents x^{y+z} = x^y x^z and
(x^y)^z = x^{yz} for all integers x, y, z ≥ 0.


1.154. Let X be any set (possibly infinite). Prove that there exists an injection g : X →
P(X), but there exists no surjection f : X → P(X). Conclude that |X| < |P(X)|, and in
particular n < 2^n for all n ≥ 0.

1.155. Show that the set of functions ${}^{\mathbb{N}}\{0,1\}$ (which can be viewed as the set of infinite
sequences of zeroes and ones) is uncountably infinite.


1.156. Suppose X and Y are sets (possibly infinite), f : X → Y is any function, and
g : Y → X is an injective function. (a) Show that there exist sets A, B, C, D such that
X is the disjoint union of A and B, Y is the disjoint union of C and D, C = f[A] =
{f(x) : x ∈ A}, and B = g[D] = {g(y) : y ∈ D}. (Let Z = X ~ g[Y] and h = g ∘ f; then
let A be the intersection of all subsets U of X such that Z ∪ h[U] ⊆ U.) (b) Deduce the
Schröder-Bernstein Theorem from (a).


1.157. A sample space S consists of 25 equally likely outcomes. Suppose we randomly 
choose an ordered pair (A, B) of events in S. (a) Find the probability that A and B are 
disjoint. (b) Find the probability that A and B are independent events. 




Notes 


General treatments of combinatorics may be found in the textbooks [1, 10, 13, 16, 21, 23, 
26, 60, 113, 115, 127, 131, 134]. For elementary accounts of probability theory, see [68, 93]. 
Two advanced probability texts that include measure theory are [11, 30]. More information 
on the theory of cardinality for infinite sets may be found in [66, 95].


Combinatorial Identities and Recursions


This chapter begins with a discussion of the generalized distributive law and its
consequences, which include the multinomial and binomial theorems. We then study algebraic
and combinatorial proofs of identities involving binomial coefficients, factorials, summations, 
etc. We also introduce recursions, which provide ways to enumerate classes of combinatorial 
objects whose cardinalities are not given by closed formulas. We use recursions to obtain 
information about more intricate combinatorial objects including set partitions, integer 
partitions, equivalence relations, surjections, and lattice paths. 




2.1 Generalized Distributive Law 


Suppose we have a product of several factors, where each factor consists of a sum of some 
terms. How can we simplify such a product of sums? The following example suggests the 
answer. 


2.1. Example. Suppose A, B,C,T,U,V are n x n matrices. Let us simplify the matrix 
product 
(C+V)(A+U)(B+C+T). 


Using the distributive laws for matrices several times, we first compute
(C + V)(A + U) = C(A + U) + V(A + U) = CA + CU + VA + VU.

Now we multiply this on the right by the matrix B + C + T. Using the distributive laws
again, we obtain

CA(B + C + T) + CU(B + C + T) + VA(B + C + T) + VU(B + C + T)

= CAB + CAC + CAT + CUB + CUC + CUT + VAB + VAC + VAT + VUB + VUC + VUT.


Observe that the final answer is a sum of many terms, where each term can be viewed as a 
word drawn from the set of words {C,V} x {A,U} x {B,C,T}. We obtain such a word by 
choosing a first matrix from the first factor C+ V, then choosing a second matrix from the 
second factor A+ U, and then choosing a third matrix from the third factor B+ C+ T. 
This sequence of choices can be done in 12 ways, and accordingly there are 12 terms in the 
final sum. 
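
The correspondence between terms of the expansion and words in {C,V} × {A,U} × {B,C,T} can be made concrete with a Cartesian product. This is a minimal sketch that treats the matrix names purely as symbols; the list name `factors` is an assumption for illustration.

```python
from itertools import product

# Each factor of (C+V)(A+U)(B+C+T) is listed as its summands, in order.
factors = [["C", "V"], ["A", "U"], ["B", "C", "T"]]

# Choosing one summand from each factor (in order) gives one term of the expansion.
terms = ["".join(word) for word in product(*factors)]

print(len(terms))
# prints: 12
print(" + ".join(terms))
# prints: CAB + CAC + CAT + CUB + CUC + CUT + VAB + VAC + VAT + VUB + VUC + VUT
```

The letters in each word keep the left-to-right order of the factors, which matters because matrix multiplication is not commutative.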


The pattern in the previous example holds in general. Intuitively, to multiply together 
some factors, each of which is a sum of some terms, we choose one term from each factor 
and multiply these terms together. Then we add together all possible products obtained in 
this way. We will now give a rigorous proof of this result, which ultimately follows from the 
distributive laws for a (possibly non-commutative) ring. For convenience, we now state the 
relevant definitions from abstract algebra. Readers unfamiliar with abstract algebra may 
replace the abstract rings used below by the set of n x n matrices with real entries (which
is a particular example of a ring). 


2.2. Definition: Rings. A ring consists of a set R and two binary operations + (addition)
and · (multiplication) with domain R × R, subject to the following axioms.

∀x, y ∈ R, x + y ∈ R (closure under addition)
∀x, y, z ∈ R, x + (y + z) = (x + y) + z (associativity of addition)
∀x, y ∈ R, x + y = y + x (commutativity of addition)
∃0_R ∈ R ∀x ∈ R, x + 0_R = x = 0_R + x (existence of additive identity)
∀x ∈ R ∃-x ∈ R, x + (-x) = 0_R = (-x) + x (existence of additive inverses)
∀x, y ∈ R, x · y ∈ R (closure under multiplication)
∀x, y, z ∈ R, x · (y · z) = (x · y) · z (associativity of multiplication)
∃1_R ∈ R ∀x ∈ R, x · 1_R = x = 1_R · x (existence of multiplicative identity)
∀x, y, z ∈ R, x · (y + z) = x · y + x · z (left distributive law)
∀x, y, z ∈ R, (x + y) · z = x · z + y · z (right distributive law)

We often write xy instead of x · y. R is a commutative ring iff R satisfies the additional
axiom

∀x, y ∈ R, xy = yx (commutativity of multiplication).
2.3. Definition: Fields. A field is a commutative ring F with 1_F ≠ 0_F such that every
nonzero element of F has a multiplicative inverse:

∀x ∈ F, x ≠ 0_F ⇒ ∃y ∈ F, xy = 1_F = yx.
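
For a finite set, the axioms in 2.2 and 2.3 can be checked exhaustively. The sketch below does this for the integers modulo n, a standard example that is not discussed in the text above; the function names `check_ring_axioms` and `is_field` are assumptions made for this illustration.

```python
from itertools import product

def check_ring_axioms(n):
    """Brute-force check of the ring axioms of 2.2 for the integers modulo n (n >= 1)."""
    R = range(n)
    add = lambda x, y: (x + y) % n   # closure holds because of the reduction mod n
    mul = lambda x, y: (x * y) % n
    zero, one = 0, 1 % n
    for x, y, z in product(R, repeat=3):
        if add(x, add(y, z)) != add(add(x, y), z):          # associativity of +
            return False
        if mul(x, mul(y, z)) != mul(mul(x, y), z):          # associativity of ·
            return False
        if mul(x, add(y, z)) != add(mul(x, y), mul(x, z)):  # left distributive law
            return False
        if mul(add(x, y), z) != add(mul(x, z), mul(y, z)):  # right distributive law
            return False
    for x, y in product(R, repeat=2):
        if add(x, y) != add(y, x):                          # commutativity of +
            return False
    for x in R:
        if add(x, zero) != x or mul(x, one) != x or mul(one, x) != x:
            return False                                    # identities
        if add(x, (-x) % n) != zero:                        # additive inverses
            return False
    return True

def is_field(n):
    """Integers mod n: require 1 != 0 and an inverse for every nonzero element."""
    if n < 2:
        return False
    return all(any((x * y) % n == 1 for y in range(1, n)) for x in range(1, n))

print([m for m in range(2, 13) if check_ring_axioms(m) and is_field(m)])
# prints: [2, 3, 5, 7, 11]
```

Since multiplication modulo n is commutative, `is_field` only needs to test for one-sided inverses.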


Let R be a ring, and suppose x_1, x_2, ..., x_n ∈ R. Because addition is associative, we can
unambiguously write a sum like x_1 + x_2 + x_3 + ··· + x_n without parentheses (see 2.148).
Similarly, associativity of multiplication implies that we can write the product x_1 x_2 ··· x_n
without parentheses. Because addition in the ring is commutative, we can permute the
summands in a sum like x_1 + x_2 + ··· + x_n without changing the answer. More formally, for
any bijection f : {1, 2, ..., n} → {1, 2, ..., n}, we have

$$x_{f(1)} + x_{f(2)} + \cdots + x_{f(n)} = x_1 + x_2 + \cdots + x_n \in R$$

(see 2.149). It follows that if {x_i : i ∈ I} is a finite indexed family of ring elements, then the
sum of all these elements (denoted $\sum_{i \in I} x_i$) is well defined. Similarly, if A is a finite subset
of R, then $\sum_{x \in A} x$ is well defined. On the other hand, the products $\prod_{i \in I} x_i$ and $\prod_{x \in A} x$
are not well defined (for R non-commutative) unless we specify in advance a total ordering
on I and A.

Now we are ready to derive the general distributive law for non-commutative rings. The 
idea is to keep iterating the left and right distributive laws to obtain successively more 
general formulas. We divide the proof into a sequence of lemmas. 


2.4. Lemma. Let $R$ be a ring, $n$ a positive integer, and $x, y_1, y_2, \ldots, y_n \in R$. Then

$$x(y_1 + y_2 + \cdots + y_n) = xy_1 + xy_2 + \cdots + xy_n; \qquad (y_1 + y_2 + \cdots + y_n)x = y_1 x + y_2 x + \cdots + y_n x.$$

Proof. We prove the first equation by induction on $n$. The case $n = 1$ is immediate, while
the case $n = 2$ is the left distributive law for $R$. Now assume that $x(y_1 + y_2 + \cdots + y_n) =
xy_1 + xy_2 + \cdots + xy_n$ is known for some $n \ge 2$; let us prove the corresponding formula for
$n + 1$. In the left distributive law for $R$, let $y = y_1 + \cdots + y_n$ and $z = y_{n+1}$. Using this and
the induction hypothesis, we calculate

$$x(y_1 + \cdots + y_n + y_{n+1}) = x(y + z) = xy + xz = x(y_1 + \cdots + y_n) + xy_{n+1} = xy_1 + \cdots + xy_n + xy_{n+1}.$$

This proves the first equation. The second is proved similarly from the right distributive
law. □

Now suppose $x \in R$ and $\{y_i : i \in I\}$ is a finite indexed family of elements of $R$. The
results in the previous lemma can be written as follows:

$$x \cdot \Big(\sum_{i \in I} y_i\Big) = \sum_{i \in I} (x \cdot y_i); \tag{2.1}$$

$$\Big(\sum_{i \in I} y_i\Big) \cdot x = \sum_{i \in I} (y_i \cdot x). \tag{2.2}$$


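2.5. Lemma. Let $R$ be a ring, and suppose $\{u_i : i \in I\}$ and $\{v_j : j \in J\}$ are two finite
indexed families of ring elements. Then

$$\Big(\sum_{i \in I} u_i\Big) \cdot \Big(\sum_{j \in J} v_j\Big) = \sum_{(i,j) \in I \times J} (u_i \cdot v_j). \tag{2.3}$$

Proof. Applying (2.2) with $x = \sum_{j \in J} v_j \in R$ and $y_i = u_i$ for all $i$, we obtain first

$$\Big(\sum_{i \in I} u_i\Big) \cdot \Big(\sum_{j \in J} v_j\Big) = \sum_{i \in I} \Big(u_i \cdot \sum_{j \in J} v_j\Big).$$

Now, for each $i \in I$, apply (2.1) with $x = u_i$ and $\sum_{i \in I} y_i$ replaced by $\sum_{j \in J} v_j$ to obtain

$$\Big(\sum_{i \in I} u_i\Big) \cdot \Big(\sum_{j \in J} v_j\Big) = \sum_{i \in I} \Big(\sum_{j \in J} (u_i \cdot v_j)\Big).$$

Finally, since addition in $R$ is commutative, the iterated sum in the last formula is equal to
the single sum

$$\sum_{(i,j) \in I \times J} (u_i \cdot v_j)$$

over the new index set $I \times J$. The lemma follows. □
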
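2.6. Theorem (Generalized Distributive Law). Suppose $R$ is a ring, $I_1, \ldots, I_n$ are
finite index sets, and $\{x_{k,i_k} : i_k \in I_k\}$ are indexed families of ring elements for $1 \le k \le n$.
Then

$$\Big(\sum_{i_1 \in I_1} x_{1,i_1}\Big) \cdot \Big(\sum_{i_2 \in I_2} x_{2,i_2}\Big) \cdots \Big(\sum_{i_n \in I_n} x_{n,i_n}\Big) = \sum_{(i_1,\ldots,i_n) \in I_1 \times \cdots \times I_n} (x_{1,i_1} \cdot x_{2,i_2} \cdots x_{n,i_n}). \tag{2.4}$$

We can also write this as

$$\prod_{k=1}^{n} \Big(\sum_{i_k \in I_k} x_{k,i_k}\Big) = \sum_{(i_1,\ldots,i_n) \in I_1 \times \cdots \times I_n} \Big(\prod_{k=1}^{n} x_{k,i_k}\Big) \tag{2.5}$$

provided we remember that the order of the factors in $\prod_{k=1}^{n} u_k = u_1 u_2 \cdots u_n$ is crucial.

Proof. We use induction on $n$. There is nothing to prove if $n = 1$, and the case $n = 2$ was
proved in the previous lemma. Now assume the result holds for some $n \ge 2$; we prove the
result for $n + 1$ factors. Using $\prod_{k=1}^{n+1} v_k = \big(\prod_{k=1}^{n} v_k\big) \cdot v_{n+1}$, then the induction hypothesis
for $n$ factors, then the result for 2 factors, we compute

$$
\begin{aligned}
\prod_{k=1}^{n+1} \sum_{i_k \in I_k} x_{k,i_k}
&= \Big(\prod_{k=1}^{n} \sum_{i_k \in I_k} x_{k,i_k}\Big) \cdot \Big(\sum_{i_{n+1} \in I_{n+1}} x_{n+1,i_{n+1}}\Big) \\
&= \Big(\sum_{(i_1,\ldots,i_n) \in I_1 \times \cdots \times I_n} \prod_{k=1}^{n} x_{k,i_k}\Big) \cdot \Big(\sum_{i_{n+1} \in I_{n+1}} x_{n+1,i_{n+1}}\Big) \\
&= \sum_{((i_1,\ldots,i_n),\,i_{n+1}) \in (I_1 \times \cdots \times I_n) \times I_{n+1}} \Big(\Big(\prod_{k=1}^{n} x_{k,i_k}\Big) \cdot x_{n+1,i_{n+1}}\Big).
\end{aligned}
$$

By commutativity of addition, the final expression is equal to

$$\sum_{(i_1,\ldots,i_{n+1}) \in I_1 \times \cdots \times I_{n+1}} \ \prod_{k=1}^{n+1} x_{k,i_k}.$$

This completes the induction. □
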
Here is a formula that follows from the generalized distributive law. 
2.7. Theorem. If $y_1, \ldots, y_n, z_1, \ldots, z_n$ are elements of a commutative ring $R$, then

$$\prod_{k=1}^{n} (y_k + z_k) = \sum_{S \subseteq \{1,2,\ldots,n\}} \ \prod_{k \in S} z_k \prod_{k \notin S} y_k.$$

Proof. Write $x_{k,0} = y_k$ and $x_{k,1} = z_k$ for $1 \le k \le n$, and let $I_1 = I_2 = \cdots = I_n = \{0, 1\}$.
Using 2.6 gives

$$\prod_{k=1}^{n} (y_k + z_k) = \prod_{k=1}^{n} (x_{k,0} + x_{k,1}) = \sum_{(i_1,\ldots,i_n) \in \{0,1\}^n} \ \prod_{k=1}^{n} x_{k,i_k}.$$

Now, we can use the bijection in 1.38 to convert the sum over binary words in $\{0,1\}^n$
to a sum over subsets $S \subseteq \{1, 2, \ldots, n\}$. Suppose $(i_1, \ldots, i_n)$ corresponds to $S$ under the
bijection. Then $k \in S$ iff $i_k = 1$ iff $x_{k,i_k} = z_k$, while $k \notin S$ iff $i_k = 0$ iff $x_{k,i_k} = y_k$. It follows
that the summand indexed by $S$ is

$$\prod_{k=1}^{n} x_{k,i_k} = \prod_{k :\, i_k = 1} x_{k,i_k} \prod_{k :\, i_k = 0} x_{k,i_k} = \prod_{k \in S} z_k \prod_{k \notin S} y_k.$$

Note that the first equality here used the commutativity of $R$. □
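The identity in 2.7 can be spot-checked numerically. The following is a minimal Python sketch, assuming the commutative ring is the integers; the function names `product_of_sums` and `sum_over_subsets` are illustrative and do not come from the text.

```python
from itertools import combinations
from math import prod

def product_of_sums(ys, zs):
    """Left side of 2.7: the product of (y_k + z_k) over all positions k."""
    return prod(y + z for y, z in zip(ys, zs))

def sum_over_subsets(ys, zs):
    """Right side of 2.7: for each subset S, take z_k if k is in S and y_k otherwise."""
    n = len(ys)
    total = 0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            chosen = set(subset)
            total += prod(zs[k] if k in chosen else ys[k] for k in range(n))
    return total
```

Each of the $2^n$ subsets contributes exactly one term, mirroring the bijection between binary words and subsets used in the proof.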


2.2. Multinomial and Binomial Theorems 


We now deduce some consequences of the generalized distributive law. In particular, we 
derive the non-commutative and commutative versions of the multinomial theorem and the 
binomial theorem. 
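2.8. Theorem. Suppose $R$ is a ring and $A_1, \ldots, A_n$ are finite subsets of $R$. Then

$$\Big(\sum_{w_1 \in A_1} w_1\Big) \cdot \Big(\sum_{w_2 \in A_2} w_2\Big) \cdots \Big(\sum_{w_n \in A_n} w_n\Big) = \sum_{(w_1,w_2,\ldots,w_n) \in A_1 \times A_2 \times \cdots \times A_n} w_1 w_2 \cdots w_n.$$

Proof. In 2.6, choose the index sets $I_k = A_k$, and define $x_{k,i_k} = i_k \in A_k \subseteq R$ for each
$k \le n$ and each $i_k \in I_k$. Then $\sum_{i_k \in I_k} x_{k,i_k} = \sum_{w_k \in A_k} w_k$ for each $k$, and $\prod_{k=1}^{n} x_{k,w_k} =
w_1 w_2 \cdots w_n$. Thus the formula in the theorem is a special case of (2.4). □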


To emphasize the combinatorial nature of the previous result, we can write it as follows: 


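$$\Big(\sum_{w_1 \in A_1} w_1\Big) \cdot \Big(\sum_{w_2 \in A_2} w_2\Big) \cdots \Big(\sum_{w_n \in A_n} w_n\Big) = \sum_{\text{words } w \in A_1 \times A_2 \times \cdots \times A_n} w_1 w_2 \cdots w_n.$$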


Intuitively, we simplify a given product of sums by choosing one letter w; from each factor, 
concatenating (multiplying) these letters to get a word, and adding all the words obtainable 
in this way. 


2.9. Non-Commutative Multinomial Theorem. Suppose $R$ is a ring, $n \in \mathbb{N}$, and
$A = \{z_1, \ldots, z_s\} \subseteq R$. Then

$$(z_1 + z_2 + \cdots + z_s)^n = \sum_{w \in A^n} w_1 w_2 \cdots w_n.$$

Proof. Take $A_1 = A_2 = \cdots = A_n = A$ in 2.8. □


2.10. Example. If $A$ and $B$ are two $n \times n$ matrices, then

$$(A + B)^3 = AAA + AAB + ABA + ABB + BAA + BAB + BBA + BBB.$$

If $A, B, \ldots, Z$ are 26 matrices, then

$$(A + B + \cdots + Z)^4 = AAAA + AAAB + AABA + \cdots + ZZZY + ZZZZ
= \text{the sum of all 4-letter words.}$$
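The first expansion in this example can be confirmed with concrete matrices that do not commute. The sketch below is illustrative only: it uses plain Python lists as $2 \times 2$ integer matrices, and the helper names (`mat_mul`, `mat_add`, `word_product`, `power_of_sum`, `sum_of_words`) and the sample matrices are assumptions, not part of the text.

```python
from itertools import product

def mat_mul(p, q):
    """Multiply two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(p, q):
    """Add two matrices entry by entry."""
    return [[a + b for a, b in zip(rp, rq)] for rp, rq in zip(p, q)]

def word_product(word):
    """Multiply the letters of a nonempty word from left to right (order matters)."""
    result = word[0]
    for letter in word[1:]:
        result = mat_mul(result, letter)
    return result

def power_of_sum(letters, n):
    """Compute (sum of the letters)^n directly, for n >= 1."""
    total = letters[0]
    for m in letters[1:]:
        total = mat_add(total, m)
    return word_product([total] * n)

def sum_of_words(letters, n):
    """Add the products of all n-letter words over the given alphabet, for n >= 1."""
    words = product(letters, repeat=n)
    result = word_product(next(words))
    for w in words:
        result = mat_add(result, word_product(w))
    return result

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
```

Since `mat_mul(A, B)` and `mat_mul(B, A)` differ for these matrices, the eight words such as $ABA$ and $BAA$ cannot be merged, yet `power_of_sum([A, B], 3)` and `sum_of_words([A, B], 3)` agree.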


2.11. Remark. Our statement of the non-commutative multinomial theorem tacitly assumed
that $z_1, \ldots, z_s$ were distinct ring elements. This assumption can be dropped at the
expense of a slight notation change. More precisely, if $\{z_i : i \in I\}$ is a finite indexed family
of ring elements, then it follows from 2.6 that

$$\Big(\sum_{i \in I} z_i\Big)^n = \sum_{w \in I^n} z_{w_1} z_{w_2} \cdots z_{w_n}.$$

Similar comments apply to the theorems below.


2.12. Commutative Multinomial Theorem. Suppose $R$ is a ring, $n \in \mathbb{N}$, and
$z_1, \ldots, z_s \in R$ are elements of $R$ that commute (meaning $z_i z_j = z_j z_i$ for all $i, j$). Then

$$(z_1 + z_2 + \cdots + z_s)^n = \sum_{n_1 + n_2 + \cdots + n_s = n} \binom{n}{n_1, n_2, \ldots, n_s} z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s}.$$

The summation here extends over all ordered sequences $(n_1, n_2, \ldots, n_s)$ of nonnegative
integers that sum to $n$.

Proof. Let $A = \{z_1, z_2, \ldots, z_s\} \subseteq R$. Let $X$ be the set of all ordered sequences
$(n_1, n_2, \ldots, n_s)$ of nonnegative integers that sum to $n$. The non-commutative multinomial
theorem gives

$$(z_1 + z_2 + \cdots + z_s)^n = \sum_{w \in A^n} w_1 w_2 \cdots w_n.$$

The set $A^n$ of $n$-letter words over the alphabet $A$ is the disjoint union of the sets
$R(z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s})$, as $(n_1, \ldots, n_s)$ ranges over $X$. By commutativity of addition, we therefore
have

$$(z_1 + z_2 + \cdots + z_s)^n = \sum_{(n_1,\ldots,n_s) \in X} \ \sum_{w \in R(z_1^{n_1} \cdots z_s^{n_s})} w_1 w_2 \cdots w_n.$$

Now we use the assumption that all the $z_i$'s commute with one another. This assumption
allows us to reorder any product of $z_i$'s so that all $z_1$'s come first, followed by the $z_2$'s, etc.
Given $w \in R(z_1^{n_1} \cdots z_s^{n_s})$, reordering the letters of $w$ gives

$$w_1 w_2 \cdots w_n = z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s}.$$

Thus,

$$
\begin{aligned}
(z_1 + z_2 + \cdots + z_s)^n
&= \sum_{(n_1,\ldots,n_s) \in X} \ \sum_{w \in R(z_1^{n_1} \cdots z_s^{n_s})} z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s} \\
&= \sum_{(n_1,\ldots,n_s) \in X} z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s} \cdot \sum_{w \in R(z_1^{n_1} \cdots z_s^{n_s})} 1_R.
\end{aligned}
$$

The inner sum is $|R(z_1^{n_1} \cdots z_s^{n_s})| = \binom{n}{n_1, \ldots, n_s}$ (or more precisely, the sum of this many
copies of $1_R$). Thus, we obtain the formula in the statement of the theorem. □

2.13. Example. If $xy = yx$ and $xz = zx$ and $yz = zy$, then

$$(x + y + z)^3 = x^3 + y^3 + z^3 + 3x^2y + 3x^2z + 3y^2z + 3xy^2 + 3xz^2 + 3yz^2 + 6xyz.$$
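The coefficients in this example are multinomial coefficients indexed by exponent vectors $(n_1, n_2, n_3)$ summing to 3. A short Python sketch, with the hypothetical helper names `compositions`, `multinomial`, and `expansion`, lists them for any $n$ and $s$:

```python
from math import factorial

def compositions(n, s):
    """Yield all ordered sequences of s nonnegative integers that sum to n (s >= 1)."""
    if s == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, s - 1):
            yield (first,) + rest

def multinomial(parts):
    """Return n! / (n_1! n_2! ... n_s!) where n is the sum of the parts."""
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)
    return result

def expansion(n, s):
    """Map each exponent vector to its coefficient in (z_1 + ... + z_s)^n."""
    return {parts: multinomial(parts) for parts in compositions(n, s)}
```

For instance, `expansion(3, 3)` has ten entries, one for each monomial in the display above, and its coefficients add up to $3^3 = 27$, the total number of 3-letter words over a 3-letter alphabet.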


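2.14. Commutative Binomial Theorem. Suppose $R$ is a ring, $n \in \mathbb{N}$, and $x, y \in R$ are
ring elements such that $xy = yx$. Then

$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$

Proof. Apply the commutative multinomial theorem with $s = 2$, $z_1 = x$, and $z_2 = y$ to get

$$(x + y)^n = \sum_{n_1 + n_2 = n} \binom{n}{n_1, n_2} x^{n_1} y^{n_2}.$$

Let $n_1 = k$; note that the possible values of $k$ are $0, 1, \ldots, n$. Once $n_1$ has been chosen, $n_2$
is uniquely determined as $n_2 = n - n_1 = n - k$. Also, $\binom{n}{k,\,n-k} = \binom{n}{k}$, so the formula becomes

$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}. \qquad \square$$
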
2.15. Example. What is the coefficient of $z^7$ in $(2z - 5)^9$? We apply the binomial theorem
taking $x = 2z$ and $y = -5$ and $n = 9$. We have

$$(2z - 5)^9 = \sum_{k=0}^{9} \binom{9}{k} (2z)^k (-5)^{9-k}.$$

The only summand involving $z^7$ is the $k = 7$ summand. The corresponding coefficient is

$$\binom{9}{7} 2^7 (-5)^2 = 115{,}200.$$
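The arithmetic in this example can be double-checked in two independent ways. The sketch below is illustrative: `binomial_term_coefficient` applies the binomial theorem directly, while `expand_linear_power` multiplies out the polynomial one factor at a time. Both function names are assumptions.

```python
from math import comb

def binomial_term_coefficient(a, b, n, k):
    """Coefficient of z^k in (a*z + b)^n, read off from the binomial theorem."""
    return comb(n, k) * a**k * b**(n - k)

def expand_linear_power(a, b, n):
    """Coefficients of (a*z + b)^n by repeated multiplication; index = power of z."""
    coeffs = [1]
    for _ in range(n):
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] += c * b        # contribution of the constant term b
            nxt[i + 1] += c * a    # contribution of the term a*z
        coeffs = nxt
    return coeffs
```

Here `binomial_term_coefficient(2, -5, 9, 7)` evaluates $\binom{9}{7} \cdot 2^7 \cdot (-5)^2 = 36 \cdot 128 \cdot 25$.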


2.16. Remark. If $r$ is any real number and $x$ is a real number such that $|x| < 1$, there
exists a power series expansion for $(1 + x)^r$ that is often called the generalized binomial
formula. This power series is discussed in 7.68.

2.3. Combinatorial Proofs 


Consider the problem of proving an identity of the form A = B, where A and B are 
formulas that may involve factorials, binomial coefficients, powers, etc. One way to prove 
such an identity is to give an algebraic proof using tools like the binomial theorem or other 
algebraic techniques. Another way to prove such an identity is to find a combinatorial proof. 
A combinatorial proof establishes the equality of two formulas by exhibiting a set of objects 
whose cardinality is given by both formulas. Thus, the main steps in a combinatorial proof 
of A = B are as follows. 


• Define a set $S$ of objects.

• Give a counting argument (using the sum rule, product rule, bijections, etc.) to prove
that $|S| = A$.

• Give a different counting argument to prove that $|S| = B$.

• Conclude that $A = B$.


We now give some examples illustrating this technique and its variations. 


2.17. Theorem. For all $n \in \mathbb{N}$,

$$2^n = \sum_{k=0}^{n} \binom{n}{k}.$$

Proof. We give an algebraic proof and a combinatorial proof.
Algebraic Proof. By the binomial theorem, we know that

$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k} \qquad (x, y \in \mathbb{R}).$$

Setting $x = y = 1$ yields the desired formula.
Combinatorial Proof. Fix $n \in \mathbb{N}$. Let $S$ be the set of all subsets of $\{1, 2, \ldots, n\}$. As shown
earlier, $|S| = 2^n$ since we can build a typical subset by either including or excluding each
of the $n$ available elements. On the other hand, note that $S$ is the disjoint union

$$S = S_0 \cup S_1 \cup \cdots \cup S_n,$$

where $S_k$ consists of all $k$-element subsets of $\{1, 2, \ldots, n\}$. As shown earlier, $|S_k| = \binom{n}{k}$. By
the sum rule, we therefore have

$$|S| = \sum_{k=0}^{n} |S_k| = \sum_{k=0}^{n} \binom{n}{k}.$$

Thus,

$$2^n = |S| = \sum_{k=0}^{n} \binom{n}{k}. \qquad \square$$


2.18. Theorem. For all integers $n, k$ with $0 \le k \le n$, we have

$$\binom{n}{k} = \binom{n}{n-k}.$$

Proof. Again we give both an algebraic proof and a combinatorial proof.
Algebraic Proof. Using the explicit formula for binomial coefficients involving factorials, we
calculate

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \frac{n!}{(n-k)!\,(n-(n-k))!} = \binom{n}{n-k}.$$

Combinatorial Proof. For this proof, we will define two different sets of objects. Fix $n$
and $k$. Let $S$ be the set of all $k$-element subsets of $\{1, 2, \ldots, n\}$, and let $T$ be the set of
all $(n-k)$-element subsets of $\{1, 2, \ldots, n\}$. We have already shown that $|S| = \binom{n}{k}$ and
$|T| = \binom{n}{n-k}$. We complete the proof by exhibiting a bijection $\phi : S \to T$, which shows
that $|S| = |T|$. Given $A \in S$, define $\phi(A) = \{1, 2, \ldots, n\} \setminus A$. Since $A$ has $k$ elements,
$\phi(A)$ has $n - k$ elements and is thus an element of $T$. The inverse of this map is the map
$\phi' : T \to S$ given by $\phi'(B) = \{1, 2, \ldots, n\} \setminus B$ for $B \in T$. Note that $\phi$ and $\phi'$ are both
restrictions of the "set complement" map $I : \mathcal{P}(\{1, 2, \ldots, n\}) \to \mathcal{P}(\{1, 2, \ldots, n\})$ given by
$I(A) = \{1, 2, \ldots, n\} \setminus A$. Since $I \circ I$ is the identity map on $\mathcal{P}(\{1, 2, \ldots, n\})$, it follows that
$\phi'$ is the two-sided inverse of $\phi$. □


2.19. Theorem. For all integers $n \ge 0$,

$$\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}.$$

Proof. In terms of factorials, we are trying to prove that

$$\sum_{k=0}^{n} \frac{(n!)^2}{(k!)^2\,((n-k)!)^2} = \frac{(2n)!}{n!\,n!}.$$

An algebraic proof of this formula is not evident. So we proceed to look for a combinatorial
proof.

Define $S$ to be the set of all $n$-element subsets of $X = \{1, 2, \ldots, 2n\}$. This choice of
$S$ was motivated by our knowledge that $|S| = \binom{2n}{n}$, which is the right side of the desired
identity. To complete the proof, we count $S$ in a new way. Let $X_1 = \{1, 2, \ldots, n\}$ and
$X_2 = \{n+1, \ldots, 2n\}$. For $0 \le k \le n$, define

$$S_k = \{A \in S : |A \cap X_1| = k \text{ and } |A \cap X_2| = n - k\}.$$

Evidently, $S$ is the disjoint union of the $S_k$'s, so that $|S| = \sum_{k=0}^{n} |S_k|$ by the sum rule. To
compute $|S_k|$, we build a typical object $A \in S_k$ by making two choices. First, choose the
$k$-element subset $A \cap X_1$ in any of $\binom{n}{k}$ ways. Second, choose the $(n-k)$-element subset
$A \cap X_2$ in any of $\binom{n}{n-k} = \binom{n}{k}$ ways. We see that $|S_k| = \binom{n}{k}^2$ by the product rule. Thus,
$|S| = \sum_{k=0}^{n} \binom{n}{k}^2$, completing the proof. □

FIGURE 2.1
A combinatorial proof using lattice paths.
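The decomposition of $S$ into the blocks $S_k$ can be carried out by brute force for small $n$. In the Python sketch below, the function name `count_by_split` is an assumption; it lists every $n$-element subset of $\{1, \ldots, 2n\}$ and records how many of its elements fall in $X_1 = \{1, \ldots, n\}$.

```python
from itertools import combinations
from math import comb

def count_by_split(n):
    """Return [|S_0|, ..., |S_n|], where S_k holds the n-element subsets of
    {1,...,2n} having exactly k elements in {1,...,n}."""
    counts = [0] * (n + 1)
    for subset in combinations(range(1, 2 * n + 1), n):
        k = sum(1 for x in subset if x <= n)
        counts[k] += 1
    return counts
```

Comparing `count_by_split(n)` with the list of squares $\binom{n}{k}^2$, and its total with `comb(2 * n, n)`, checks both sides of the identity without any appeal to factorials.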


One can often find different combinatorial proofs of a given identity. For example, here
is an alternate proof of the previous identity using lattice paths. Let $S$ be the set of all
lattice paths from the origin to $(n, n)$; we know that $|S| = \binom{2n}{n,n} = \binom{2n}{n}$. For $0 \le k \le n$, let
$S_k$ be the set of all paths in $S$ passing through the point $(k, n-k)$ on the line $x + y = n$.
Every path in $S$ must go through exactly one such point for some $k$ between 0 and $n$, so
$S$ is the disjoint union of $S_0, S_1, \ldots, S_n$. See Figure 2.1. To build a path in $S_k$, first choose
a path from $(0, 0)$ to $(k, n-k)$ in any of $\binom{k+(n-k)}{k,\,n-k} = \binom{n}{k}$ ways. Second, choose a path from
$(k, n-k)$ to $(n, n)$. This is a path in a rectangle of width $n - k$ and height $n - (n-k) = k$,
so there are $\binom{(n-k)+k}{n-k,\,k} = \binom{n}{k}$ ways to make this second choice. By the sum and product rules,
we conclude that

$$|S| = \sum_{k=0}^{n} |S_k| = \sum_{k=0}^{n} \binom{n}{k}^2.$$


Lattice paths can often be used to give elegant, visually appealing combinatorial proofs of 
identities involving binomial coefficients. We conclude this section with two more examples 
of this kind. 


2.20. Theorem. For all integers $a \ge 0$ and $b \ge 1$,

$$\binom{a+b}{a,\,b} = \sum_{k=0}^{a} \binom{k+b-1}{k,\,b-1}.$$

FIGURE 2.2
Another combinatorial proof using lattice paths.

Proof. Let $S$ be the set of all lattice paths from the origin to $(a, b)$. We already know that
$|S| = \binom{a+b}{a,\,b}$. For $0 \le k \le a$, let $S_k$ be the set of paths $\pi \in S$ such that the last north step of
$\pi$ lies on the line $x = k$. See Figure 2.2. We can build a path in $S_k$ by choosing any lattice
path from the origin to $(k, b-1)$ in $\binom{k+b-1}{k,\,b-1}$ ways, and then appending one north step and
$a - k$ east steps. Thus, the required identity follows from the sum rule. If we classify the
paths by the final east step instead, we obtain the dual identity (for $a \ge 1$, $b \ge 0$)

$$\binom{a+b}{a,\,b} = \sum_{j=0}^{b} \binom{a-1+j}{a-1,\,j}.$$

This identity also follows from the previous one by the known symmetry
$\binom{a+b}{a,\,b} = \binom{b+a}{b,\,a}$. □

2.21. Theorem (Chu-Vandermonde Identity). For all integers $a, b, c \ge 0$,

$$\binom{a+b+c+1}{a,\,b+c+1} = \sum_{k=0}^{a} \binom{k+b}{k,\,b} \binom{a-k+c}{a-k,\,c}.$$

Proof. Let $S$ be the set of all lattice paths from the origin to $(a, b+c+1)$. We know that
$|S| = \binom{a+b+c+1}{a,\,b+c+1}$. For $0 \le k \le a$, let $S_k$ be the set of paths $\pi \in S$ that contain the north
step from $(k, b)$ to $(k, b+1)$. Since every path in $S$ must cross the line $y = b + 1/2$ by
taking a north step between the lines $x = 0$ and $x = a$, we see that $S$ is the disjoint union
of $S_0, S_1, \ldots, S_a$. See Figure 2.3. Now, we can build a path in $S_k$ as follows. First, choose
a lattice path from the origin to $(k, b)$ in $\binom{k+b}{k,\,b}$ ways. Second, append a north step to this
path. Third, choose a lattice path from $(k, b+1)$ to $(a, b+c+1)$. This is a path in a
rectangle of width $a - k$ and height $c$, so there are $\binom{a-k+c}{a-k,\,c}$ ways to make this choice. Thus,
$|S_k| = \binom{k+b}{k,\,b} \cdot 1 \cdot \binom{a-k+c}{a-k,\,c}$ by the product rule. The desired identity now follows from the sum
rule. □

We remark that 2.20 is the special case of 2.21 obtained by setting $c = 0$ and replacing
$b$ by $b - 1$.
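Both lattice-path identities are easy to test numerically. The following Python sketch assumes the closed formula $\binom{w+h}{w}$ for the number of paths in a $w \times h$ rectangle (so it checks the identities, not the path counts themselves); the names `paths` and `chu_vandermonde_rhs` are illustrative.

```python
from math import comb

def paths(width, height):
    """Number of lattice paths (east and north steps) across a width-by-height rectangle."""
    return comb(width + height, width)

def chu_vandermonde_rhs(a, b, c):
    """Sum over k of (paths from the origin to (k, b)) times
    (paths from (k, b+1) to (a, b+c+1))."""
    return sum(paths(k, b) * paths(a - k, c) for k in range(a + 1))
```

Setting `c = 0` and replacing `b` by `b - 1` in `chu_vandermonde_rhs` gives the sum in 2.20, since `paths(a - k, 0)` is always 1.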
FIGURE 2.3
A third combinatorial proof using lattice paths.


2.4. Recursions


Suppose we are given some unknown quantities $a_0, a_1, \ldots, a_n, \ldots$. A closed formula for
these quantities is an expression of the form $a_n = f(n)$, where the right side is some explicit
formula involving the integer $n$ but not involving any of the unknown quantities $a_i$. In
contrast, a recursive formula for $a_n$ is an expression of the form $a_n = f(n, a_0, a_1, \ldots, a_{n-1})$,
where the right side is a formula that does involve one or more of the unknown quantities
$a_i$. A recursive formula is usually accompanied by one or more initial conditions, which are
non-recursive expressions for $a_0$ and possibly other $a_i$'s. Similar definitions apply to doubly
indexed sequences $a_{n,k}$.

Now consider the problem of counting sets of combinatorial objects. Suppose we have
several related families of objects, say $T_0, T_1, \ldots, T_n, \ldots$. We think of the index $n$ as somehow
measuring the size of the objects in $T_n$. Sometimes we can give an explicit description of
the objects in $T_n$ (using the sum and product rules) leading to a closed formula for $|T_n|$.
In many cases, however, it is more natural to give a recursive description of $T_n$, which tells
us how to construct a typical object in $T_n$ by assembling smaller objects of the same kind
from the sets $T_0, \ldots, T_{n-1}$. Such an argument leads to a recursive formula for $|T_n|$ in terms
of one or more of the quantities $|T_0|, \ldots, |T_{n-1}|$. If we suspect that $|T_n|$ is also given by a
certain closed formula, we can then prove this fact using induction. We use the following
example to illustrate these ideas.


2.22. Theorem: Recursion for Subsets. For each integer $n \ge 0$, let $T_n$ be the set
of all subsets of $\{1, 2, \ldots, n\}$, and let $a_n = |T_n|$. We derive a recursive formula for $a_n$ as
follows. Suppose $n \ge 1$ and we are trying to build a typical subset $A \in T_n$. We can do this
recursively by first choosing a subset $A' \subseteq \{1, 2, \ldots, n-1\}$ in any of $|T_{n-1}| = a_{n-1}$ ways,
and then either adding or not adding the element $n$ to this subset (2 possibilities). By the
product rule, we conclude that

$$a_n = a_{n-1} \cdot 2 \qquad (n \ge 1).$$

The initial condition is $a_0 = 1$, since $T_0 = \{\emptyset\}$.
Using the recursion and initial condition, we calculate:

$$(a_0, a_1, a_2, a_3, a_4, a_5, \ldots) = (1, 2, 4, 8, 16, 32, \ldots).$$

The pattern suggests that $a_n = 2^n$ for all $n \ge 0$. (We have already proved this earlier, but
we wish to reprove this fact using our recursion.) We will prove that $a_n = 2^n$ by induction
on $n$. In the base case ($n = 0$), we have $a_0 = 1 = 2^0$ by the initial condition. Assume that
$n > 0$ and that $a_{n-1} = 2^{n-1}$ (this is the induction hypothesis). Using the recursion and the
induction hypothesis, we see that

$$a_n = 2a_{n-1} = 2(2^{n-1}) = 2^n.$$

This completes the proof by induction.


2.23. Example: Fibonacci Words. Let $W_n$ be the set of all words in $\{0,1\}^n$ that do
not have two consecutive zeroes, and let $f_n = |W_n|$. We now derive a recursion and initial
condition for the sequence of $f_n$'s. First, direct enumeration shows that $f_0 = 1$ and $f_1 = 2$.
Suppose $n \ge 2$. We use the sum rule to find a formula for $|W_n| = f_n$. Given $w \in W_n$, $w$ starts
with either 0 or 1. If $w_1 = 0$, then we are forced to have $w_2 = 1$, and then $w' = w_3 w_4 \cdots w_n$
can be an arbitrary word in $W_{n-2}$. Therefore, there are $f_{n-2}$ words in $W_n$ starting with
0. On the other hand, if $w_1 = 1$, then $w' = w_2 \cdots w_n$ can be an arbitrary word in $W_{n-1}$.
Therefore, there are $f_{n-1}$ words in $W_n$ starting with 1. By the sum rule,

$$f_n = f_{n-1} + f_{n-2} \qquad (n \ge 2).$$

Using this recursion and the initial conditions, we compute

$$(f_0, f_1, f_2, f_3, f_4, f_5, \ldots) = (1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, \ldots).$$

This sequence is called the Fibonacci sequence. We will find an explicit closed formula for
$f_n$ later (see 2.134(a) or §7.14).
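The recursion for $f_n$ can be compared against direct enumeration. In this Python sketch the function names `fib_words_brute` and `fib_words_recursive` are assumptions; the first lists all binary words and discards those containing `00`, while the second uses only the initial conditions and the recursion.

```python
from itertools import product

def fib_words_brute(n):
    """Count words in {0,1}^n with no two consecutive zeroes by listing them all."""
    return sum(1 for w in product("01", repeat=n) if "00" not in "".join(w))

def fib_words_recursive(n):
    """Use f_0 = 1, f_1 = 2 and f_n = f_{n-1} + f_{n-2} for n >= 2."""
    prev, cur = 1, 2          # (f_0, f_1)
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur
```

The brute-force count examines $2^n$ words, whereas the recursive version needs only $n - 1$ additions, which illustrates why a recursion is useful even when no closed formula is at hand.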


Now we consider some examples involving doubly indexed families of combinatorial 
objects. We begin by revisiting the enumeration of k-permutations, subsets, multisets, and 
anagrams. We will reprove some of our earlier counting results by recursive methods. 


2.24. Recursion for k-Permutations. For all integers $n, k \ge 0$, let $P(n, k)$ be the number
of $k$-permutations of an $n$-element set. (One can show bijectively that every $n$-element set
has the same number of $k$-permutations, so that $P(n, k)$ is a well-defined integer. Alternatively,
we could define $P(n, k)$ to be the number of $k$-permutations of a particular $n$-element
set like $\{1, 2, \ldots, n\}$. However, in the latter case, the argument in the text must be modified
accordingly. Similar comments apply to later examples.) Recall that a $k$-permutation is an
ordered sequence $x_1 x_2 \cdots x_k$ of distinct elements from the given $n$-element set. Observe that
$P(n, k) = 0$ whenever $k > n$. On the other hand, $P(n, 0) = 1$ for all $n \ge 0$ since the empty
sequence is the unique 0-permutation of any $n$-element set. Now assume that $0 < k \le n$.
We can build a typical $k$-permutation $x = x_1 x_2 \cdots x_k$ of a given $n$-element set $X$ as follows.
First, choose $x_1$ in any of $n$ ways. For the second choice, note that $x' = x_2 x_3 \cdots x_k$ can be
any $(k-1)$-permutation of the $(n-1)$-element set $X \setminus \{x_1\}$. There are $P(n-1, k-1)$
choices for $x'$, by definition. The product rule thus gives us the recursion

$$P(n, k) = n P(n-1, k-1) \qquad (0 < k \le n).$$

The initial conditions are $P(n, 0) = 1$ for all $n$ and $P(n, k) = 0$ for all $k > n$.
In §1.4, we used the product rule to prove that $P(n, k) = n(n-1) \cdots (n-k+1) =
n!/(n-k)!$ for $0 \le k \le n$. Let us now reprove this result using our recursion. We proceed
by induction on $n$. In the base case, $n = 0$ and hence $k = 0$. The initial condition gives

$$P(n, k) = P(0, 0) = 1 = 0!/(0-0)! = n!/(n-k)!.$$

For the induction step, assume that $n > 0$ and that

$$P(n-1, j) = (n-1)!/(n-1-j)! \qquad (0 \le j \le n-1).$$

Fix $k$ with $0 \le k \le n$. If $k = 0$, the initial condition gives

$$P(n, k) = P(n, 0) = 1 = n!/(n-0)! = n!/(n-k)!.$$

If $k > 0$, we use the recursion and induction hypothesis (applied to $j = k-1$) to compute

$$P(n, k) = n P(n-1, k-1) = n \cdot \frac{(n-1)!}{((n-1)-(k-1))!} = \frac{n!}{(n-k)!}.$$

This completes the proof by induction.
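The recursion and initial conditions for $P(n, k)$ translate directly into a memoized function. This is a minimal Python sketch; the names `P` and `P_brute` are chosen here for illustration, and `P_brute` simply lists the $k$-permutations of $\{0, \ldots, n-1\}$ as an independent check.

```python
from functools import lru_cache
from itertools import permutations
from math import factorial

@lru_cache(maxsize=None)
def P(n, k):
    """Number of k-permutations of an n-element set, via P(n,k) = n * P(n-1,k-1)."""
    if k == 0:
        return 1      # the empty sequence is the unique 0-permutation
    if k > n:
        return 0      # not enough distinct elements
    return n * P(n - 1, k - 1)

def P_brute(n, k):
    """Count the k-permutations of {0, ..., n-1} by listing them."""
    return sum(1 for _ in permutations(range(n), k))
```

For $0 \le k \le n$ the values agree with the closed formula `factorial(n) // factorial(n - k)` proved by induction above.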


2.25. Recursion for k-element Subsets. For all integers $n, k \ge 0$, let $C(n, k)$ be the
number of $k$-element subsets of $\{1, 2, \ldots, n\}$. Observe that $C(n, k) = 0$ whenever $k > n$. On
the other hand, the initial condition $C(n, 0) = 1$ follows since the empty set is the unique
zero-element subset of any set. Similarly, $C(n, n) = 1$ since $\{1, 2, \ldots, n\}$ is the only $n$-element
subset of itself. Let us now derive a recursion for $C(n, k)$ assuming that $0 < k < n$. A typical
$k$-element subset $A$ of $\{1, 2, \ldots, n\}$ either does or does not contain $n$ as a member. In the
former case, we can construct $A$ by choosing any $(k-1)$-element subset of $\{1, 2, \ldots, n-1\}$
in $C(n-1, k-1)$ ways, and then appending $n$ as the final member of $A$. In the latter case,
we can construct $A$ by choosing any $k$-element subset of $\{1, 2, \ldots, n-1\}$ in $C(n-1, k)$
ways. By the sum rule, we deduce Pascal's recursion

$$C(n, k) = C(n-1, k-1) + C(n-1, k) \qquad (0 < k < n).$$

For $n > 0$, this recursion even holds for $k = 0$ and $k = n$, provided we use the conventions
that $C(a, b) = 0$ whenever $b < 0$ or $b > a$.

In §1.8, we proved that $C(n, k) = \frac{n!}{k!\,(n-k)!} = \binom{n}{k}$. Let us reprove this result using the
recursion and initial conditions. We proceed by induction on $n$. The base case $n = k = 0$
follows since $C(0, 0) = 1 = \frac{0!}{0!\,(0-0)!}$. Assume $n > 0$ and that

$$C(n-1, j) = \frac{(n-1)!}{j!\,(n-1-j)!} \qquad (0 \le j \le n-1).$$

Fix $k$ with $0 \le k \le n$. If $k = 0$, the initial condition gives

$$C(n, k) = 1 = \frac{n!}{0!\,(n-0)!}$$

as desired. Similarly, the result holds when $k = n$. If $0 < k < n$, we use the recursion and
induction hypothesis (applied to $j = k-1$ and to $j = k$, which are integers between 0 and
$n - 1$) to compute

$$
\begin{aligned}
C(n, k) &= C(n-1, k-1) + C(n-1, k) \\
&= \frac{(n-1)!}{(k-1)!\,(n-k)!} + \frac{(n-1)!}{k!\,(n-k-1)!} \\
&= \frac{(n-1)!\,k}{k!\,(n-k)!} + \frac{(n-1)!\,(n-k)}{k!\,(n-k)!} \\
&= \frac{(n-1)!}{k!\,(n-k)!} \cdot [k + (n-k)] \\
&= \frac{n!}{k!\,(n-k)!}.
\end{aligned}
$$

This completes the proof by induction.

n=0:                    1
n=1:                  1   1
n=2:                1   2   1
n=3:              1   3   3   1
n=4:            1   4   6   4   1
n=5:          1   5  10  10   5   1
n=6:        1   6  15  20  15   6   1
n=7:      1   7  21  35  35  21   7   1
n=8:    1   8  28  56  70  56  28   8   1

FIGURE 2.4
Pascal's Triangle.


The reader may wonder what good it is to have a recursion for $C(n, k)$, since we already
proved by other methods the explicit formula $C(n, k) = \frac{n!}{k!\,(n-k)!}$. There are several answers
to this question. One answer is that the recursion for the $C(n, k)$'s gives us a fast method
for calculating these quantities that is more efficient than computing with factorials. One
popular way of displaying this calculation is called Pascal's Triangle. We build this triangle
by writing the $n + 1$ numbers $C(n, 0), C(n, 1), \ldots, C(n, n)$ in the $n$th row from the top. If
we position the entries as shown in Figure 2.4, then each entry is the sum of the two entries
directly above it. We compute $C(n, k)$ by calculating rows 0 through $n$ of this triangle.

Note that computing $C(n, k)$ via Pascal's recursion requires only addition operations.
In contrast, calculation using the closed formula $\frac{n!}{k!\,(n-k)!}$ requires us to divide one large
factorial by the product of two other factorials. For example, Pascal's Triangle quickly gives
$C(8, 4) = 70$, while the closed formula gives $\binom{8}{4} = \frac{8!}{4!\,4!} = \frac{40320}{24^2} = 70$.
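The addition-only computation described here can be sketched in a few lines of Python. The function name `pascal_rows` is an assumption; padding each row with a zero on both sides encodes the conventions $C(a, b) = 0$ for $b < 0$ or $b > a$, so that Pascal's recursion also covers the two ends of every row.

```python
def pascal_rows(n):
    """Return rows 0 through n of Pascal's Triangle, using only additions."""
    rows = [[1]]                      # row 0: C(0, 0) = 1
    for _ in range(n):
        prev = rows[-1]
        padded = [0] + prev + [0]     # C(a, -1) = 0 and C(a, a + 1) = 0
        # Each new entry is the sum of the two entries directly above it.
        rows.append([padded[i] + padded[i + 1] for i in range(len(prev) + 1)])
    return rows
```

For example, `pascal_rows(8)[8][4]` reproduces $C(8, 4)$ without ever forming the factorial $8! = 40320$.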

In bijective combinatorics, it turns out that the arithmetic operation of division is much
harder to understand (from a combinatorial standpoint) than the operations of addition and
multiplication. In particular, our original derivation of the formula $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ was an
indirect argument using the product rule, in which we divided by $k!$ at the end (§1.8). For
later applications (e.g., listing all $k$-element subsets of a given $n$-element set, or randomly
selecting a $k$-element subset), it is convenient to have a counting argument that does not
rely on division. See Chapter 5 for more details.

A final reason for studying recursions for C(n,k) is to emphasize that recursions are 
helpful and ubiquitous tools for studying combinatorial objects. Indeed, we will soon be 
studying combinatorial collections whose cardinalities may not be given by explicit closed
formulas. Nevertheless, these cardinalities satisfy recursions that allow them to be computed 
quickly and efficiently. 


2.5. Recursions for Multisets and Anagrams


This section continues to give examples of combinatorial recursions for objects we have 
studied before, namely multisets and anagrams. 


2.26. Recursion for Multisets. In §1.11, we counted $k$-element multisets on an $n$-letter
alphabet using bijective techniques. Now, we give a recursive analysis to reprove the
enumeration results for multisets. For all integers $n, k \ge 0$, let $M(n, k)$ be the number of
$k$-element multisets using letters from $\{1, 2, \ldots, n\}$. The initial conditions are $M(n, 0) = 1$
for all $n \ge 0$ and $M(0, k) = 0$ for all $k > 0$. We now derive a recursion for $M(n, k)$ assuming
$n > 0$ and $k > 0$. A typical multiset counted by $M(n, k)$ either does not contain $n$ at all or
contains one or more copies of $n$. In the former case, the multiset is a $k$-element multiset
using letters from $\{1, 2, \ldots, n-1\}$, and there are $M(n-1, k)$ such multisets. In the latter
case, if we remove one copy of $n$ from the multiset, we obtain an arbitrary $(k-1)$-element
multiset using letters from $\{1, 2, \ldots, n\}$. There are $M(n, k-1)$ such multisets. By the sum
rule, we obtain the recursion

$$M(n, k) = M(n-1, k) + M(n, k-1) \qquad (n > 0,\ k > 0).$$

One can now prove that for all $n > 0$ and all $k \ge 0$,

$$M(n, k) = \binom{k+n-1}{k,\,n-1} = \frac{(k+n-1)!}{k!\,(n-1)!}.$$

The proof is by induction on $n$, and is similar to the corresponding proof for $C(n, k)$. We
leave this proof as an exercise.

If desired, we can use the recursion to compute values of $M(n, k)$. Here we use a left-justified
table of entries in which the $n$th row contains the numbers $M(n, 0), M(n, 1), \ldots$.
The values in the top row (where $n = 0$) and in the left column (where $k = 0$) are given by
the initial conditions. Each remaining entry in the table is the sum of the number directly
above it and the number directly to its left. See Figure 2.5. The reader will perceive that
this is merely a shifted version of Pascal's Triangle.
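The table-filling procedure just described can be sketched as follows. The names `multiset_table` and `multiset_closed_form` are illustrative assumptions; the closed form is only evaluated for $n \ge 1$, matching the range in which the formula above is stated.

```python
from math import comb

def multiset_table(max_n, max_k):
    """Return table[n][k] = M(n, k) for 0 <= n <= max_n and 0 <= k <= max_k."""
    table = [[0] * (max_k + 1) for _ in range(max_n + 1)]
    for n in range(max_n + 1):
        table[n][0] = 1               # M(n, 0) = 1; M(0, k) = 0 for k > 0 stays 0
    for n in range(1, max_n + 1):
        for k in range(1, max_k + 1):
            # Entry directly above plus entry directly to the left.
            table[n][k] = table[n - 1][k] + table[n][k - 1]
    return table

def multiset_closed_form(n, k):
    """M(n, k) = (k + n - 1)! / (k! (n - 1)!), valid for n >= 1."""
    return comb(k + n - 1, k)
```

Row 6 of `multiset_table(6, 6)` reads 1, 6, 21, 56, 126, 252, 462, which is a diagonal of Pascal's Triangle.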


          k=0  k=1  k=2  k=3  k=4  k=5  k=6
n=0:        1    0    0    0    0    0    0
n=1:        1    1    1    1    1    1    1
n=2:        1    2    3    4    5    6    7
n=3:        1    3    6   10   15   21   28
n=4:        1    4   10   20   35   56   84
n=5:        1    5   15   35   70  126  210
n=6:        1    6   21   56  126  252  462

FIGURE 2.5
Table for computing M(n, k).


2.27. Recursion for Multinomial Coefficients. Let $n_1, \ldots, n_s$ be nonnegative integers
that add to $n$. Let $\{a_1, \ldots, a_s\}$ be a given $s$-letter alphabet, and let $C(n; n_1, \ldots, n_s) =
|R(a_1^{n_1} \cdots a_s^{n_s})|$ be the number of $n$-letter words that are rearrangements of $n_i$ copies of $a_i$
for $1 \le i \le s$. We proved in §1.9 that

$$C(n; n_1, \ldots, n_s) = \binom{n}{n_1, \ldots, n_s} = \frac{n!}{n_1!\,n_2! \cdots n_s!}.$$

We now give a new proof of this result using recursions.

Assume first that every $n_i$ is positive. For $1 \le i \le s$, let $T_i$ be the set of words in
$T = R(a_1^{n_1} \cdots a_s^{n_s})$ that begin with the letter $a_i$. $T$ is the disjoint union of the sets $T_i$. To
build a typical word $w \in T_i$, we start with the letter $a_i$ and then append any element of
$R(a_1^{n_1} \cdots a_i^{n_i - 1} \cdots a_s^{n_s})$. There are $C(n-1; n_1, \ldots, n_i - 1, \ldots, n_s)$ ways to do this. Hence,
by the sum rule,

$$C(n; n_1, \ldots, n_s) = \sum_{i=1}^{s} C(n-1; n_1, \ldots, n_i - 1, \ldots, n_s).$$

If we adopt the convention that $C(n; n_1, \ldots, n_s) = 0$ whenever any $n_i$ is negative, then
this recursion holds (with the same proof) for all choices of $n_i \ge 0$ and $n > 0$. The initial
condition is

$$C(0; 0, 0, \ldots, 0) = 1,$$

since the empty word is the unique rearrangement of zero copies of the given letters.
Now let us prove that

$$C(n; n_1, \ldots, n_s) = \frac{n!}{\prod_{k=1}^{s} n_k!}$$

by induction on $n$. In the base case, $n = n_1 = \cdots = n_s = 0$, and the desired formula follows
from the initial condition. For the induction step, assume that $n > 0$ and that

$$C(n-1; m_1, \ldots, m_s) = \frac{(n-1)!}{\prod_{k=1}^{s} m_k!}$$

whenever $m_1 + \cdots + m_s = n - 1$. Assume that we are given integers $n_1, \ldots, n_s \ge 0$ that sum to $n$.
Now, using the recursion and induction hypothesis, we compute as follows:

$$
\begin{aligned}
C(n; n_1, \ldots, n_s) &= \sum_{k=1}^{s} C(n-1; n_1, \ldots, n_k - 1, \ldots, n_s) \\
&= \sum_{\substack{1 \le k \le s \\ n_k > 0}} \frac{(n-1)!}{(n_k - 1)! \prod_{j \ne k} n_j!} \\
&= \sum_{\substack{1 \le k \le s \\ n_k > 0}} \frac{(n-1)!\,n_k}{\prod_{j=1}^{s} n_j!} = \sum_{k=1}^{s} \frac{(n-1)!\,n_k}{\prod_{j=1}^{s} n_j!} \\
&= \frac{(n-1)!}{\prod_{j=1}^{s} n_j!} \Big[\sum_{k=1}^{s} n_k\Big] = \frac{n!}{\prod_{j=1}^{s} n_j!}.
\end{aligned}
$$

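The first-letter recursion for anagram counts can be run as written. In this Python sketch, `anagram_count` takes the tuple $(n_1, \ldots, n_s)$ of letter multiplicities (with $n$ implied as their sum) and `multinomial_formula` evaluates the factorial formula; both names are assumptions made for illustration.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def anagram_count(counts):
    """C(n; n_1, ..., n_s) for a tuple of multiplicities, via the first-letter recursion."""
    if any(c < 0 for c in counts):
        return 0                      # convention: zero if some n_i is negative
    if sum(counts) == 0:
        return 1                      # the empty word
    total = 0
    for i in range(len(counts)):
        # Words beginning with the i-th letter: remove one copy of that letter.
        reduced = counts[:i] + (counts[i] - 1,) + counts[i + 1:]
        total += anagram_count(reduced)
    return total

def multinomial_formula(counts):
    """n! / (n_1! ... n_s!) with n = n_1 + ... + n_s."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result
```

For example, `anagram_count((2, 1, 1))` counts the rearrangements of a four-letter word with one repeated letter, giving $4!/(2!\,1!\,1!) = 12$.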
2.6. Recursions for Lattice Paths


Recursive techniques allow us to count many collections of lattice paths. We first consider 
the situation of lattice paths in a rectangle. 


2.28. Recursion for Paths in a Rectangle. For $a, b \ge 0$, let $L(a, b)$ be the number of
lattice paths from the origin to $(a, b)$. We have $L(a, 0) = L(0, b) = 1$ for all $a, b \ge 0$. If $a > 0$
and $b > 0$, note that any lattice path ending at $(a, b)$ arrives there via an east step or a
north step. We obtain lattice paths of the first kind by taking any lattice path ending at
$(a-1, b)$ and appending an east step. We obtain lattice paths of the second kind by taking
any lattice path ending at $(a, b-1)$ and appending a north step. Hence, by the sum rule,

$$L(a, b) = L(a-1, b) + L(a, b-1) \qquad (a, b > 0).$$

One can now show (by induction on $a + b$) that

$$L(a, b) = \binom{a+b}{a,\,b} = \frac{(a+b)!}{a!\,b!}.$$

We can visually display and calculate the numbers $L(a, b)$ by labeling each lattice point
$(a, b)$ with the number $L(a, b)$. The initial conditions say that the lattice points on the axes
are labeled 1. The recursion says that the label of some point $(a, b)$ is the sum of the labels
of the point $(a-1, b)$ to its immediate left and the point $(a, b-1)$ immediately below it.
See Figure 2.6.
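The labeling procedure amounts to filling a grid one point at a time. A minimal Python sketch, with the hypothetical name `lattice_path_grid`, is:

```python
from math import comb

def lattice_path_grid(max_a, max_b):
    """Return grid[a][b] = L(a, b) for 0 <= a <= max_a and 0 <= b <= max_b."""
    # Points on the axes are labeled 1 (initial conditions).
    grid = [[1] * (max_b + 1) for _ in range(max_a + 1)]
    for a in range(1, max_a + 1):
        for b in range(1, max_b + 1):
            # Label = label of the left neighbor + label of the neighbor below.
            grid[a][b] = grid[a - 1][b] + grid[a][b - 1]
    return grid
```

Each label can then be compared with `comb(a + b, a)`, the closed formula for the number of paths in a rectangle.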


FIGURE 2.6
Recursive enumeration of lattice paths.


By modifying the boundary conditions, we can adapt the recursion in the previous 
example to count more complicated collections of lattice paths. 


2.29. Recursion for Paths in a Triangle. For $b \ge a \ge 0$, let $T(a, b)$ be the number
of lattice paths from the origin to $(a, b)$ that always stay weakly above the line $y = x$. (In
particular, $T(n, n)$ is the number of Dyck paths of order $n$.) By the same argument used
above, we have

$$T(a, b) = T(a-1, b) + T(a, b-1) \qquad (b > a > 0).$$

On the other hand, when $a = b > 0$, a lattice path can only reach $(a, b) = (a, a)$ by taking
an east step, since the point $(a, b-1)$ lies below $y = x$. Thus,

$$T(a, a) = T(a-1, a) \qquad (a > 0).$$

The initial conditions are $T(0, b) = 1$ for all $b \ge 0$. Figure 2.7 shows how to compute the
numbers $T(a, b)$ by drawing a picture.

FIGURE 2.7
Recursive enumeration of lattice paths in a triangle.


It turns out that there is an explicit closed formula for the numbers T’(a, b). 


2.30. Theorem: Ballot Numbers. For $b \ge a \ge 0$, the number of lattice paths from the 
origin to $(a,b)$ that always stay weakly above the line $y = x$ is 

\[ \frac{b-a+1}{b+a+1}\binom{a+b+1}{a}. \]

In particular, the number of Dyck paths of order $n$ is 

\[ \frac{1}{2n+1}\binom{2n+1}{n} = C_n. \]


Proof. We show that 

\[ T(a,b) = \frac{b-a+1}{b+a+1}\binom{a+b+1}{a} \]

by induction on $a+b$. If $a+b = 0$, so that $a = b = 0$, then $T(0,0) = 1 = \frac{0-0+1}{0+0+1}\binom{0+0+1}{0}$. Now 
assume that $a+b > 0$ and that $T(c,d) = \frac{d-c+1}{d+c+1}\binom{c+d+1}{c}$ whenever $d \ge c \ge 0$ and $c+d < a+b$. 


To prove the desired formula for $T(a,b)$, we consider cases based on the recursions and 
initial conditions. First, if $a = 0$ and $b \ge 0$, we have $T(a,b) = 1 = \frac{b-0+1}{b+0+1}\binom{0+b+1}{0}$. Second, if 
$a = b > 0$, we have 


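\[ \begin{aligned}
T(a,b) = T(a,a) = T(a-1,a) &= \frac{2}{2a}\binom{2a}{a-1} \\
&= \frac{1}{2a+1}\binom{2a+1}{a} \\
&= \frac{b-a+1}{b+a+1}\binom{a+b+1}{a}.
\end{aligned} \]

Third, if $b > a > 0$, we have 

\[ \begin{aligned}
T(a,b) &= T(a-1,b) + T(a,b-1) = \frac{b-a+2}{a+b}\binom{a+b}{a-1} + \frac{b-a}{a+b}\binom{a+b}{a} \\
&= \frac{(b-a+2)(a+b-1)!}{(a-1)!\,(b+1)!} + \frac{(b-a)(a+b-1)!}{a!\,b!} \\
&= \left[\frac{a(b-a+2)}{a+b} + \frac{(b-a)(b+1)}{a+b}\right]\frac{(a+b)!}{a!\,(b+1)!} \\
&= \left[\frac{ab-a^2+2a+b^2-ab+b-a}{a+b}\right]\frac{(a+b+1)!}{(a+b+1)\,a!\,(b+1)!} \\
&= \left[\frac{(b-a+1)(a+b)}{a+b}\right]\frac{1}{b+a+1}\binom{a+b+1}{a} \\
&= \frac{b-a+1}{b+a+1}\binom{a+b+1}{a}. \qquad \square
\end{aligned} \]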


The numbers $T(a,b)$ in the previous theorem are called ballot numbers, for the following 
reason. Let $\pi \in \{N, E\}^{a+b}$ be a lattice path counted by $T(a,b)$. Imagine that $a+b$ people 
are voting for two candidates (“candidate N” and “candidate E”) by casting an ordered 
sequence of $a+b$ ballots. The path $\pi$ records this sequence of ballots as follows: $\pi_j = N$ if 
the $j$th person votes for candidate N, and $\pi_j = E$ if the $j$th person votes for candidate E. 
The condition that $\pi$ stays weakly above $y = x$ means that candidate N always has at least 
as many votes as candidate E at each stage in the election process. The condition that $\pi$ 
ends at $(a,b)$ means that candidate N has $b$ votes and candidate E has $a$ votes at the end 
of the election. 

Returning to lattice paths, suppose we replace the boundary line $y = x$ by the line 
$y = mx$ (where $m$ is any positive integer). We can then derive the following more general 
result. 


2.31. Theorem: m-Ballot Numbers. Let $m$ be a fixed positive integer. For $b \ge ma \ge 0$, 
the number of lattice paths from the origin to $(a,b)$ that always stay weakly above the line 
$y = mx$ is 

\[ \frac{b-ma+1}{b+a+1}\binom{a+b+1}{a}. \]

In particular, the number of such paths ending at $(n, mn)$ is 

\[ \frac{1}{(m+1)n+1}\binom{(m+1)n+1}{n}. \]


Proof. Let $T_m(a,b)$ be the number of paths ending at $(a,b)$ that never go below $y = mx$. 
Arguing as before, we have $T_m(0,b) = 1$ for all $b \ge 0$; $T_m(a,b) = T_m(a-1,b) + T_m(a,b-1)$ 
whenever $b > ma > 0$; and $T_m(a,ma) = T_m(a-1,ma)$ since the point $(a, ma-1)$ lies below 
the line $y = mx$. One now proves that 

\[ T_m(a,b) = \frac{b-ma+1}{b+a+1}\binom{a+b+1}{a} \]

by induction on $a+b$. The proof is similar to the one given above, so we leave it as an 
exercise. For a bijective proof of this theorem, see 12.92. □ 
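As a numerical sanity check (not a substitute for the induction proof), the recursion and boundary conditions for paths above y = mx can be compared with the closed formula for small values. The function names `m_ballot_recursive` and `m_ballot_formula` below are hypothetical names chosen for this sketch; taking m = 1 gives the ordinary ballot numbers.

```python
from functools import lru_cache
from math import comb

def m_ballot_recursive(a, b, m=1):
    """Count lattice paths from (0,0) to (a,b) that stay weakly above y = m*x."""
    @lru_cache(maxsize=None)
    def T(x, y):
        if x < 0 or y < m * x:
            return 0            # point lies outside the allowed region
        if x == 0:
            return 1            # initial condition: T(0, y) = 1 for y >= 0
        # Last step is east (from (x-1, y)) or north (from (x, y-1)).
        return T(x - 1, y) + T(x, y - 1)
    return T(a, b)

def m_ballot_formula(a, b, m=1):
    """Closed formula (b - m*a + 1)/(b + a + 1) * C(a+b+1, a), valid for b >= m*a >= 0."""
    return (b - m * a + 1) * comb(a + b + 1, a) // (b + a + 1)

# m = 1 and a = b = n gives the number of Dyck paths of order n.
print([m_ballot_recursive(n, n) for n in range(6)])
```

The region test `y < m * x` encodes the boundary rule: on the line itself the term coming from below vanishes, which is exactly the special case T(a, ma) = T(a-1, ma).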


When the slope $m$ of the boundary line $y = mx$ is not an integer, we cannot use 
the formula in the preceding theorem. Nevertheless, the recursion (with appropriate initial 
conditions) can still be used to count lattice paths bounded below by this line. For example, 
Figure 2.8 illustrates the enumeration of lattice paths from (0,0) to (6,9) that always stay 
weakly above $y = (3/2)x$. 


FIGURE 2.8 
Recursive enumeration of lattice paths above y = (3/2)x. 


FIGURE 2.9 
Recursive enumeration of lattice paths in an irregular region. 

We end this section with a general recursion for counting lattice paths in a given region. 


2.32. Theorem: General Lattice Path Recursion. Suppose $V$ is a given set of lattice 
points in $\mathbb{N} \times \mathbb{N}$ containing the origin. For $(a,b) \in V$, let $T_V(a,b)$ be the number of lattice 
paths from the origin to $(a,b)$ that visit only lattice points in $V$. Then $T_V(0,0) = 1$ and 


\[ T_V(a,b) = T_V(a-1,b)\,\chi((a-1,b) \in V) + T_V(a,b-1)\,\chi((a,b-1) \in V) \qquad \text{for } (a,b) \ne (0,0). \]


The proof is immediate from the sum rule. Figure 2.9 illustrates the use of this recursion 
to count lattice paths contained in an irregular region. In the figure, lattice points in V are 
drawn as closed circles, while X’s indicate certain forbidden lattice points that the path is 
not allowed to use. 
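The general recursion translates directly into a short dynamic program. The sketch below assumes the region V is given as a finite Python set of (a, b) pairs; the name `count_paths_in_region` and the choice to return 0 for a target outside V are assumptions of this example.

```python
def count_paths_in_region(V, target):
    """Number of lattice paths from (0,0) to target that visit only points of V."""
    V = set(V)
    if (0, 0) not in V or target not in V:
        return 0
    T = {}
    # Process points in order of a + b, so both predecessors are handled first.
    for (a, b) in sorted(V, key=lambda p: (p[0] + p[1], p[0])):
        if (a, b) == (0, 0):
            T[(a, b)] = 1
        else:
            # T.get(..., 0) plays the role of the indicator chi(point in V).
            T[(a, b)] = T.get((a - 1, b), 0) + T.get((a, b - 1), 0)
    return T[target]

# Region weakly above y = (3/2)x inside the rectangle with corner (6, 9).
V = {(a, b) for a in range(7) for b in range(10) if 2 * b >= 3 * a}
print(count_paths_in_region(V, (6, 9)))
```

Forbidden points are handled simply by leaving them out of V, so the same function covers triangles, strips, and irregular regions.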




2.7 Catalan Recursions 


The recursions from the previous section provide one way of computing Catalan numbers, 
which are a special case of ballot numbers. This section discusses another recursion that 
involves only the Catalan numbers. This “convolution recursion” comes up in many settings, 
thus leading to many different combinatorial interpretations for the Catalan numbers. 


2.33. Theorem: Catalan Recursion. The Catalan numbers $C_n = \frac{1}{n+1}\binom{2n}{n}$ satisfy the 
recursion 


\[ C_n = \sum_{k=1}^{n} C_{k-1} C_{n-k} \qquad (n > 0) \]


and initial condition $C_0 = 1$. 


Proof. Recall from 1.56 that $C_n$ is the number of Dyck paths of order $n$. There is one Dyck 
path of order 0, so $C_0 = 1$. Fix $n > 0$, and let $A$ be the set of Dyck paths ending at $(n,n)$. 
For $1 \le k \le n$, let 


\[ A_k = \{\pi \in A : (k,k) \in \pi \text{ and } (j,j) \notin \pi \text{ for } 0 < j < k\}. \]


In other words, $A_k$ consists of the Dyck paths of order $n$ that return to the diagonal line 
$y = x$ for the first time at the point $(k,k)$. See Figure 2.10. Suppose $w$ is the word in 
$\{N, E\}^{2n}$ that encodes a path $\pi \in A_k$. Inspection of Figure 2.10 shows that we have the 
factorization $w = N w_1 E w_2$, where $w_1$ encodes a Dyck path of order $k-1$ (starting at 
(0,1) in the figure) and $w_2$ encodes a Dyck path of order $n-k$ (starting at $(k,k)$ in the 
figure). We can uniquely construct all paths in $A_k$ by choosing $w_1$ and $w_2$ and then setting 
$w = N w_1 E w_2$. There are $C_{k-1}$ choices for $w_1$ and $C_{n-k}$ choices for $w_2$. By the product rule 
and sum rule, 


\[ C_n = |A| = \sum_{k=1}^{n} |A_k| = \sum_{k=1}^{n} C_{k-1} C_{n-k}. \qquad \square \]


The next result shows that the Catalan recursion uniquely determines the Catalan numbers. 


FIGURE 2.10 
Proving the Catalan recursion by analyzing the first return to y = x. 


2.34. Theorem. Suppose $(d_n : n \ge 0)$ is a sequence such that $d_0 = 1$ and 

\[ d_n = \sum_{k=1}^{n} d_{k-1} d_{n-k} \qquad (n > 0). \]

Then $d_n = C_n = \frac{1}{n+1}\binom{2n}{n}$ for all $n \ge 0$. 


Proof. We argue by strong induction. For $n = 0$, we have $d_0 = 1 = C_0$. Assume that $n > 0$ 
and that $d_m = C_m$ for all $m < n$. Then 


\[ d_n = \sum_{k=1}^{n} d_{k-1} d_{n-k} = \sum_{k=1}^{n} C_{k-1} C_{n-k} = C_n. \qquad \square \]


We can now prove that various collections of objects are counted by the Catalan numbers. 
One proof method sets up a bijection between such objects and other objects (like Dyck 
paths) that are already known to be counted by Catalan numbers. A second proof method 
shows that the new collections of objects satisfy the Catalan recursion. We illustrate both 
methods in the examples below. 
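Before turning to the examples, it may help to see the convolution recursion run numerically. The following sketch (the function name `catalan_by_recursion` is an assumption of this example) builds the sequence from the initial condition and the recursion alone, then compares it with the closed formula.

```python
from math import comb

def catalan_by_recursion(n_max):
    """Return [C_0, ..., C_{n_max}] using C_n = sum_{k=1}^n C_{k-1} C_{n-k}, C_0 = 1."""
    C = [1]
    for n in range(1, n_max + 1):
        C.append(sum(C[k - 1] * C[n - k] for k in range(1, n + 1)))
    return C

values = catalan_by_recursion(10)
# The recursion reproduces the closed formula C_n = binom(2n, n) / (n + 1).
assert values == [comb(2 * n, n) // (n + 1) for n in range(11)]
```

Any family of objects whose counts satisfy the same recursion and initial condition must therefore produce this same list.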


2.35. Example: Balanced Parentheses. For $n \ge 0$, let $BP_n$ be the set of all words 
consisting of $n$ left parentheses and $n$ right parentheses, such that every left parenthesis can 
be matched with a right parenthesis later in the word. For example, $BP_3$ consists of the 
following five words: 


((())), (()()), ()(()), (())(), ()()(). 


We show that $|BP_n| = C_n$ for all $n$ by exhibiting a bijection between $BP_n$ and the set of 
Dyck paths of order $n$. Given $w \in BP_n$, replace each left parenthesis by N (which encodes 
a north step) and each right parenthesis by E (which encodes an east step). One can check 
that a string $w$ of $n$ left and $n$ right parentheses is balanced iff for every $i \le 2n$, the number 
of left parentheses in the prefix $w_1 w_2 \cdots w_i$ weakly exceeds the number of right parentheses 
in this prefix. Converting to north and east steps, this condition means that no lattice point 
on the path lies strictly below the line $y = x$. Thus we have mapped each $w \in BP_n$ to a 
Dyck path. This map is a bijection, so $|BP_n| = C_n$. 


FIGURE 2.11 
The five binary trees with three nodes. 


FIGURE 2.12 
A binary tree with ten nodes. 
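The prefix condition and the letter-by-letter map to Dyck words are easy to try out by brute force. In this sketch the function names (`is_balanced`, `balanced_words`, `to_dyck_word`) are assumptions made for illustration, and the enumeration over all 2^(2n) strings is only practical for small n.

```python
from itertools import product

def is_balanced(word):
    """True iff every prefix has at least as many '(' as ')' and the totals agree."""
    depth = 0
    for ch in word:
        depth += 1 if ch == "(" else -1
        if depth < 0:          # some prefix has more right than left parentheses
            return False
    return depth == 0

def balanced_words(n):
    """All balanced words with n left and n right parentheses (brute force)."""
    return ["".join(w) for w in product("()", repeat=2 * n) if is_balanced(w)]

def to_dyck_word(word):
    """Replace '(' by N (north step) and ')' by E (east step)."""
    return word.replace("(", "N").replace(")", "E")

for w in balanced_words(3):
    print(w, to_dyck_word(w))
```

Since the map only renames letters, it is clearly invertible; the content of the example is that the prefix condition on parentheses matches the "weakly above y = x" condition on paths.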


2.36. Example: Binary Trees. We recursively define the set of binary trees with $n$ nodes 
as follows. The empty set is the unique binary tree with 0 nodes. If $T_1$ is a binary tree with 
$h$ nodes and $T_2$ is a binary tree with $k$ nodes, then the ordered triple $T = (\bullet, T_1, T_2)$ is a 
binary tree with $h + k + 1$ nodes. By definition, all binary trees arise by a finite number of 
applications of these rules. If $T = (\bullet, T_1, T_2)$ is a binary tree, we call $T_1$ the left subtree of 
$T$ and $T_2$ the right subtree of $T$. Note that $T_1$ or $T_2$ (or both) may be empty. We can draw 
a picture of a nonempty binary tree $T$ as follows. First, draw a root node of the binary tree 
at the top of the picture. If $T_1$ is nonempty, draw an edge leading down and left from the 
root node, and then draw the picture of $T_1$. If $T_2$ is nonempty, draw an edge leading down 
and right from the root node, and then draw the picture of $T_2$. For example, Figure 2.11 
displays the five binary trees with three nodes. Figure 2.12 depicts a larger binary tree that 
is formally represented by the sequence 


\[ T = (\bullet, (\bullet, (\bullet, (\bullet,\emptyset,\emptyset), (\bullet,\emptyset,\emptyset)), (\bullet,\emptyset,\emptyset)), (\bullet, (\bullet,\emptyset,(\bullet,(\bullet,\emptyset,\emptyset),\emptyset)),\emptyset)). \]


Let $BT_n$ denote the set of binary trees with $n$ nodes. We show that $|BT_n| = C_n$ for all $n$ by 
verifying that the sequence $(|BT_n| : n \ge 0)$ satisfies the Catalan recursion. First, $|BT_0| = 1$ 
by definition. Second, suppose $n \ge 1$. By the recursive definition of binary trees, we can 
uniquely construct a typical element of $BT_n$ as follows. Fix $k$ with $1 \le k \le n$. Choose a 
tree $T_1 \in BT_{k-1}$ with $k-1$ nodes. Then choose a tree $T_2 \in BT_{n-k}$ with $n-k$ nodes. We 
assemble these trees (together with a new root node) to get a binary tree $T = (\bullet, T_1, T_2)$ 
with $(k-1) + 1 + (n-k) = n$ nodes. By the sum and product rules, we have 


\[ |BT_n| = \sum_{k=1}^{n} |BT_{k-1}|\,|BT_{n-k}|. \]

By 2.34, $|BT_n| = C_n$ for all $n \ge 0$. 


2.37. Example: 231-avoiding permutations. Suppose $w = w_1 w_2 \cdots w_n$ is a permutation 
of $n$ distinct integers. We say that $w$ is 231-avoiding iff there do not exist indices 
$i < k < p$ such that $w_p < w_i < w_k$. This means that no three-element subsequence 
$w_i \ldots w_k \ldots w_p$ in $w$ has the property that $w_p$ is the smallest number in $\{w_i, w_k, w_p\}$ and 
$w_k$ is the largest number in $\{w_i, w_k, w_p\}$. For example, when $n = 4$, there are fourteen 
231-avoiding permutations of $\{1,2,3,4\}$: 


1234, 1243, 1324, 1423, 1432, 2134, 2143, 


3124, 3214, 4123, 4132, 4213, 4312, 4321. 


The following ten permutations do contain occurrences of the pattern 231: 
2314, 2341, 2431, 4231, 3421, 3412, 3142, 1342, 3241, 2413. 


Let $S_n^{231}$ be the set of 231-avoiding permutations of $\{1,2,\ldots,n\}$. We prove that $|S_n^{231}| = C_n$ 
for all $n \ge 0$ by verifying the Catalan recursion. First, $|S_0^{231}| = 1 = C_0$ since the empty 
permutation is certainly 231-avoiding. Next, suppose $n > 0$. We construct a typical object 
$w \in S_n^{231}$ as follows. Consider cases based on the position of the letter $n$ in $w$. Say $w_k = n$. 
For all $i < k$ and all $p > k$, we must have $w_i < w_p$; otherwise, the subsequence $w_i, w_k = n, w_p$ 
would be an occurrence of the forbidden 231 pattern. Assuming that $w_i < w_p$ whenever 
$i < k < p$, one checks that $w = w_1 w_2 \cdots w_n$ is 231-avoiding iff $w_1 w_2 \cdots w_{k-1}$ is 231-avoiding 
and $w_{k+1} \cdots w_n$ is 231-avoiding. Thus, for a fixed $k$, we can construct $w$ by choosing an 
arbitrary 231-avoiding permutation $w'$ of the $k-1$ letters $\{1,2,\ldots,k-1\}$ in $|S_{k-1}^{231}|$ ways, 
then choosing an arbitrary 231-avoiding permutation $w''$ of the $n-k$ letters $\{k,\ldots,n-1\}$ 
in $|S_{n-k}^{231}|$ ways, and finally letting $w$ be the concatenation of $w'$, the letter $n$, and $w''$. By 
the sum and product rules, we have 


\[ |S_n^{231}| = \sum_{k=1}^{n} |S_{k-1}^{231}|\,|S_{n-k}^{231}|. \]


By 2.34, $|S_n^{231}| = C_n$ for all $n \ge 0$. 
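The construction in this example can be checked against a direct test of the definition. In the sketch below, `avoids_231` tests the definition by brute force and `build_231_avoiding` follows the construction "w' , n, w''" from the argument above; both names are hypothetical, and permutations are represented as tuples.

```python
from itertools import combinations, permutations

def avoids_231(w):
    """True iff there are no indices i < k < p with w[p] < w[i] < w[k]."""
    return not any(w[p] < w[i] < w[k]
                   for i, k, p in combinations(range(len(w)), 3))

def build_231_avoiding(n):
    """Build all 231-avoiding permutations of {1,...,n} via the recursive construction."""
    if n == 0:
        return [()]
    result = []
    for k in range(1, n + 1):                      # position of the letter n
        for left in build_231_avoiding(k - 1):     # w' uses letters 1..k-1
            for right in build_231_avoiding(n - k):
                # w'' uses letters k..n-1: shift a permutation of 1..n-k by k-1.
                shifted = tuple(x + k - 1 for x in right)
                result.append(left + (n,) + shifted)
    return result

brute = {w for w in permutations(range(1, 5)) if avoids_231(w)}
assert set(build_231_avoiding(4)) == brute
```

Agreement of the two sets for small n illustrates that the construction produces each 231-avoiding permutation exactly once.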


2.38. Example: τ-avoiding permutations. Let $\tau : \{1,2,\ldots,k\} \to \{1,2,\ldots,k\}$ be a 
fixed permutation of $k$ letters. A permutation $w$ of $\{1,2,\ldots,n\}$ is called $\tau$-avoiding iff there 
do not exist indices $1 \le i(1) < i(2) < \cdots < i(k) \le n$ such that 


\[ w_{i(\tau^{-1}(1))} < w_{i(\tau^{-1}(2))} < \cdots < w_{i(\tau^{-1}(k))}. \]


This means that no subsequence of $k$ entries of $w$ consists of numbers in the same relative 
order as the numbers $\tau_1, \tau_2, \ldots, \tau_k$. For instance, $w = 15362784$ is not 2341-avoiding, since 
the subsequence 5684 matches the pattern 2341 (as does the subsequence 5674). On the 
other hand, $w$ is 4321-avoiding, since there is no descending subsequence of $w$ of length 4. 
Let $S_n^{\tau}$ denote the set of $\tau$-avoiding permutations of $\{1,2,\ldots,n\}$. 
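The "same relative order" condition can be tested mechanically for small cases. The following is a brute-force sketch with a hypothetical helper name `contains_pattern`; it inspects every k-element subsequence, so it is meant only for short permutations.

```python
from itertools import combinations, permutations

def contains_pattern(w, tau):
    """True iff some subsequence of w has the same relative order as tau."""
    k = len(tau)
    for idx in combinations(range(len(w)), k):
        sub = [w[i] for i in idx]
        # Same relative order: sub[x] < sub[y] exactly when tau[x] < tau[y].
        if all((sub[x] < sub[y]) == (tau[x] < tau[y])
               for x in range(k) for y in range(k)):
            return True
    return False

w = (1, 5, 3, 6, 2, 7, 8, 4)
print(contains_pattern(w, (2, 3, 4, 1)))   # the subsequence 5, 6, 8, 4 is an occurrence
print(contains_pattern(w, (4, 3, 2, 1)))   # no descending subsequence of length 4

# Count tau-avoiding permutations of {1,...,5} for each pattern tau of length 3.
counts = {tau: sum(1 for v in permutations(range(1, 6))
                   if not contains_pattern(v, tau))
          for tau in permutations((1, 2, 3))}
print(counts)
```

For the pattern 231 the test reduces to the condition w_p < w_i < w_k with i < k < p used in the previous example.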

For general $\tau$, the enumeration of $\tau$-avoiding permutations is an extremely difficult 
problem that has stimulated much research in recent years. On the other hand, if $\tau$ is a 
permutation of $k = 3$ letters, then the number of $\tau$-avoiding permutations of length $n$ is 
always the Catalan number $C_n$, for all six possible choices of $\tau$. We have already proved 
this in the last example for $\tau = 231$. The arguments in that example readily adapt to prove 
the Catalan recursion for $\tau = 132$, $\tau = 213$, and $\tau = 312$. However, more subtle arguments 
are needed to prove this result for $\tau = 123$ and $\tau = 321$ (see 12.65). 


2.39. Remark. Let $(A_n : n \ge 0)$ and $(B_n : n \ge 0)$ be two families of combinatorial objects 
such that $|A_n| = C_n = |B_n|$ for all $n$. Suppose that we have an explicit bijective proof that 
the numbers $|A_n|$ satisfy the Catalan recursion. This means that we can describe a bijection 
$g_n$ between the set $A_n$ and the disjoint union of the sets $A_{k-1} \times A_{n-k}$ for $k = 1, 2, \ldots, n$. 
(Such a bijection is usually implicit in an argument involving the sum and product rules.) 
Suppose we have similar bijections $h_n$ for the sets $B_n$. We can combine these bijections 
to obtain recursively defined bijections $f_n : A_n \to B_n$. First, there is a unique bijection 
$f_0 : A_0 \to B_0$, since $|A_0| = 1 = |B_0|$. Second, assume that $f_m : A_m \to B_m$ has already 
been defined for all $m < n$. Define $f_n : A_n \to B_n$ as follows. Given $x \in A_n$, suppose 
$g_n(x) = (k, y, z)$ where $1 \le k \le n$, $y \in A_{k-1}$, and $z \in A_{n-k}$. Set 


\[ f_n(x) = h_n^{-1}\bigl((k, f_{k-1}(y), f_{n-k}(z))\bigr). \]


The inverse map is defined analogously. 

For example, let us recursively define a bijection $\phi$ from the set of binary trees to the set 
of Dyck paths such that trees with $n$ nodes map to paths of order $n$. Linking together the 
first-return recursion for Dyck paths with the left/right-subtree recursion for binary trees 
as discussed in the previous paragraph, we obtain the rule 


\[ \phi(\emptyset) = \text{the empty word (denoted } \epsilon\text{)}; \qquad \phi((\bullet, T_1, T_2)) = N\,\phi(T_1)\,E\,\phi(T_2). \]


For example, the one-node tree $(\bullet, \emptyset, \emptyset)$ maps to the Dyck path $N\epsilon E\epsilon = NE$. It then follows 
that 

\[ \begin{aligned}
\phi((\bullet, (\bullet,\emptyset,\emptyset), \emptyset)) &= N(NE)E\epsilon = NNEE; \\
\phi((\bullet, \emptyset, (\bullet,\emptyset,\emptyset))) &= N\epsilon E(NE) = NENE; \\
\phi((\bullet, (\bullet,\emptyset,\emptyset), (\bullet,\emptyset,\emptyset))) &= N(NE)E(NE) = NNEENE;
\end{aligned} \]

and so on. Figure 2.13 illustrates the recursive computation of $\phi(T)$ for the binary tree $T$ 
shown in Figure 2.12. 

As another example, let us recursively define a bijection $\psi$ from the set of binary trees 
to the set of 231-avoiding permutations such that trees with $n$ nodes map to permutations 
of $n$ letters. Linking together the two proofs of the Catalan recursion for binary trees and 
231-avoiding permutations, we obtain the rule 


\[ \psi(\emptyset) = \epsilon, \qquad \psi((\bullet, T_1, T_2)) = \psi(T_1)\, n\, \psi'(T_2), \]


where $\psi'(T_2)$ is the permutation obtained by increasing each entry of $\psi(T_2)$ by $k - 1 = |T_1|$. 
Figure 2.14 illustrates the recursive computation of $\psi(T)$ for the binary tree $T$ shown in 
Figure 2.12. 
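Both recursive rules are short enough to transcribe directly. In this sketch, a binary tree is modeled as `None` (the empty tree) or a Python tuple `("*", left, right)`; that encoding and the function names `size`, `phi`, and `psi` are assumptions made for illustration.

```python
def size(tree):
    """Number of nodes of a binary tree given as None or ("*", left, right)."""
    return 0 if tree is None else 1 + size(tree[1]) + size(tree[2])

def phi(tree):
    """Binary tree -> Dyck word: phi(empty) = '', phi((*,T1,T2)) = N phi(T1) E phi(T2)."""
    if tree is None:
        return ""
    _, left, right = tree
    return "N" + phi(left) + "E" + phi(right)

def psi(tree):
    """Binary tree with n nodes -> permutation of 1..n: psi(T1), then n, then shifted psi(T2)."""
    if tree is None:
        return ()
    _, left, right = tree
    n, shift = size(tree), size(left)          # shift = k - 1 = |T1|
    return psi(left) + (n,) + tuple(x + shift for x in psi(right))

LEAF = ("*", None, None)
print(phi(("*", LEAF, LEAF)))   # N(NE)E(NE)
print(psi(("*", LEAF, LEAF)))

# The ten-node tree given formally in Example 2.36.
T = ("*", ("*", ("*", LEAF, LEAF), LEAF),
          ("*", ("*", None, ("*", LEAF, None)), None))
print(size(T), phi(T), psi(T))
```

Because each rule recurses only on the two subtrees, the inverse maps can be written the same way by first locating the first return to the diagonal (for φ) or the position of the largest letter (for ψ).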




2.8 Integer Partitions 


2.40. Definition: Integer Partitions. Let $n$ be a nonnegative integer. An integer partition 
of $n$ is a sequence $\mu = (\mu_1, \mu_2, \ldots, \mu_k)$ of positive integers such that $\mu_1 + \mu_2 + \cdots + \mu_k = n$ 
and $\mu_1 \ge \mu_2 \ge \cdots \ge \mu_k$. Each $\mu_i$ is called a part of the partition. Let $p(n)$ be the number of 
integer partitions of $n$, and let $p(n,k)$ be the number of integer partitions of $n$ into exactly 
$k$ parts. If $\mu$ is a partition of $n$ into $k$ parts, we write $|\mu| = n$ and $\ell(\mu) = k$ and say that $\mu$ 
has area $n$ and length $k$. Let Par denote the set of all integer partitions. 


FIGURE 2.13 
Mapping binary trees to Dyck paths. 


FIGURE 2.14 
Mapping binary trees to 231-avoiding permutations. 


2.41. Example. The integer partitions of 5 are 

(5), (4,1), (3,2), (3,1,1), (2,2,1), (2,1,1,1), (1,1,1,1,1). 


Thus, $p(5) = 7$, $p(5,1) = 1$, $p(5,2) = 2$, $p(5,3) = 2$, $p(5,4) = 1$, and $p(5,5) = 1$. As another 
example, the empty sequence is the unique integer partition of 0, so $p(0) = 1 = p(0,0)$. 


An integer partition of $n$ is a composition of $n$ in which the parts appear in weakly 
decreasing order. Informally, we can think of an integer partition of $n$ as a way of writing 
$n$ as a sum of positive integers where the order of the summands does not matter. 

We know from 1.41 that there are $2^{n-1}$ compositions of $n$. One might hope for a similar 
explicit formula for $p(n)$. Such a formula does exist (see 2.49 below), but it is extraordinarily 
complicated. Fortunately, the numbers $p(n,k)$ do satisfy a nice recursion. 


2.42. Theorem: Recursion for Integer Partitions. Let $p(n,k)$ be the number of 
integer partitions of $n$ into $k$ parts. Then 


\[ p(n,k) = p(n-1,k-1) + p(n-k,k) \qquad (n,k > 0). \]


The initial conditions are $p(n,k) = 0$ for $k > n$ or $k < 0$, $p(n,0) = 0$ for $n > 0$, and 
$p(0,0) = 1$. 


Proof. For all $i,j$, let $P(i,j)$ be the set of integer partitions of $i$ into $j$ parts. We have 
$|P(i,j)| = p(i,j)$. Fix $n,k > 0$. The set $P(n,k)$ is the disjoint union of the two sets 


\[ \begin{aligned}
Q &= \{(\mu_1,\ldots,\mu_k) \in P(n,k) : \mu_k = 1\}, \\
R &= \{(\mu_1,\ldots,\mu_k) \in P(n,k) : \mu_k > 1\}.
\end{aligned} \]

On one hand, the map $(\mu_1,\ldots,\mu_k) \mapsto (\mu_1,\ldots,\mu_{k-1})$ is a bijection from $Q$ onto $P(n-1,k-1)$ 
with inverse $(\nu_1,\ldots,\nu_{k-1}) \mapsto (\nu_1,\ldots,\nu_{k-1},1)$. So $|Q| = |P(n-1,k-1)| = p(n-1,k-1)$. On 
the other hand, the map $(\mu_1,\ldots,\mu_k) \mapsto (\mu_1-1,\mu_2-1,\ldots,\mu_k-1)$ is a bijection from $R$ onto 
$P(n-k,k)$ with inverse $(\rho_1,\ldots,\rho_k) \mapsto (\rho_1+1,\ldots,\rho_k+1)$. So $|R| = |P(n-k,k)| = p(n-k,k)$. 
The sum rule now gives 


\[ p(n,k) = |P(n,k)| = |Q| + |R| = p(n-1,k-1) + p(n-k,k). \qquad \square \]
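The recursion and its initial conditions give an immediate way to tabulate p(n, k). The memoized function below is a minimal sketch; the name `p` mirrors the text's notation, and the use of `lru_cache` is just one convenient way to avoid recomputation.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n, k):
    """Number of integer partitions of n into exactly k parts."""
    if n == 0 and k == 0:
        return 1                     # the empty partition
    if k <= 0 or k > n:
        return 0                     # remaining initial conditions
    # Smallest part equals 1 (drop it), or all parts exceed 1 (subtract 1 from each).
    return p(n - 1, k - 1) + p(n - k, k)

print([p(5, k) for k in range(1, 6)])           # partitions of 5 by number of parts
print(sum(p(12, k) for k in range(0, 13)))      # total number of partitions of 12
```

Summing over k recovers p(n), which is how the row totals in a table of p(n, k) would be obtained.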


2.43. Theorem: Dual Recursion for Partitions. Let $p'(n,k)$ be the number of integer 
partitions of $n$ with first part $k$. Then 


\[ p'(n,k) = p'(n-1,k-1) + p'(n-k,k) \qquad (n,k > 0). \]

The initial conditions are $p'(n,k) = 0$ for $k > n$ or $k < 0$, $p'(n,0) = 0$ for $n > 0$, and (by 
convention) $p'(0,0) = 1$. 


Proof. For all $i,j$, let $P'(i,j)$ be the set of integer partitions of $i$ with first part $j$. We have 
$|P'(i,j)| = p'(i,j)$. Fix $n,k > 0$. The set $P'(n,k)$ is the disjoint union of the two sets 


\[ \begin{aligned}
Q &= \{(\mu_1 = k, \mu_2, \ldots, \mu_s) \in P'(n,k) : \mu_1 > \mu_2\}, \\
R &= \{(\mu_1 = k, \mu_2, \ldots, \mu_s) \in P'(n,k) : \mu_1 = \mu_2\}.
\end{aligned} \]

(If $\mu$ has only one part, we take $\mu_2 = 0$ by convention.) On one hand, the map 
$(k, \mu_2, \ldots, \mu_s) \mapsto (k-1, \mu_2, \ldots, \mu_s)$ is a bijection from $Q$ onto $P'(n-1,k-1)$ with 
inverse $(k-1, \nu_2, \ldots, \nu_s) \mapsto (k, \nu_2, \ldots, \nu_s)$. So $|Q| = |P'(n-1,k-1)| = p'(n-1,k-1)$. 
On the other hand, the map $(k, \mu_2, \ldots, \mu_s) \mapsto (\mu_2, \mu_3, \ldots, \mu_s)$ is a bijection from $R$ onto 
$P'(n-k,k)$ with inverse $(\rho_1, \ldots, \rho_s) \mapsto (k, \rho_1, \ldots, \rho_s)$. So $|R| = |P'(n-k,k)| = p'(n-k,k)$. 
The sum rule now gives 


\[ p'(n,k) = |P'(n,k)| = |Q| + |R| = p'(n-1,k-1) + p'(n-k,k). \qquad \square \]


FIGURE 2.15 
Partition diagrams. 


2.44. Theorem: First Part vs. Number of Parts. The number of integer partitions of 
$n$ into $k$ parts equals the number of integer partitions of $n$ with first part $k$. 


Proof. We prove $p(n,k) = p'(n,k)$ for all $n$ and all $k$ by induction on $n$. The case $n = 0$ 
is true since the initial conditions show that $p(0,k) = p'(0,k)$ for all $k$. Now assume that 
$n > 0$ and that $p(m,k) = p'(m,k)$ for all $m < n$ and all $k$. If $k = 0$, then $p(n,0) = p'(n,0)$ 
follows from the initial conditions. If $k > 0$, compute 


\[ p(n,k) = p(n-1,k-1) + p(n-k,k) = p'(n-1,k-1) + p'(n-k,k) = p'(n,k). \qquad \square \]

We now describe a convenient way of visualizing integer partitions. 


2.45. Definition: Diagram of a Partition. Let $\mu = (\mu_1, \mu_2, \ldots, \mu_k)$ be an integer 
partition of $n$. The diagram of $\mu$ is the set 


\[ \mathrm{dg}(\mu) = \{(i,j) \in \mathbb{N} \times \mathbb{N} : 1 \le i \le k,\ 1 \le j \le \mu_i\}. \]


We can make a picture of $\mathrm{dg}(\mu)$ by drawing an array of $n$ boxes, with $\mu_i$ left-justified boxes 
in row $i$. For example, Figure 2.15 illustrates the diagrams for the seven integer partitions 
of 5. Note that $|\mu| = \mu_1 + \cdots + \mu_k = |\mathrm{dg}(\mu)|$ is the total number of boxes in the diagram 
of $\mu$. 


2.46. Definition: Conjugate Partitions. Suppose $\mu$ is an integer partition of $n$. The 
conjugate partition of $\mu$ is the unique integer partition $\mu'$ of $n$ satisfying 


\[ \mathrm{dg}(\mu') = \{(j,i) : (i,j) \in \mathrm{dg}(\mu)\}. \]


In other words, we obtain the diagram for $\mu'$ by interchanging the rows and columns in the 
diagram for $\mu$. For example, Figure 2.16 shows that the conjugate of $\mu = (7,4,3,1,1)$ is 
$\mu' = (5,3,3,2,1,1,1)$. 
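Diagrams and conjugates are straightforward to compute from these definitions. In the sketch below, a partition is a tuple of positive integers in weakly decreasing order; the function names `diagram` and `conjugate` are assumptions chosen to echo the text.

```python
def diagram(mu):
    """dg(mu): the set of cells (i, j) with 1 <= i <= len(mu) and 1 <= j <= mu_i."""
    return {(i, j) for i, part in enumerate(mu, start=1)
                   for j in range(1, part + 1)}

def conjugate(mu):
    """Conjugate partition: part c of the result is the number of cells in column c."""
    cells = diagram(mu)
    width = mu[0] if mu else 0
    return tuple(sum(1 for (_, j) in cells if j == c) for c in range(1, width + 1))

mu = (7, 4, 3, 1, 1)
print(conjugate(mu))
# Interchanging rows and columns of dg(mu) gives dg(mu').
assert diagram(conjugate(mu)) == {(j, i) for (i, j) in diagram(mu)}
```

Applying `conjugate` twice returns the original partition, reflecting the fact that transposing a diagram twice restores it.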


We can now give pictorial proofs of some of the preceding results concerning $p(n,k)$ 
and $p'(n,k)$. Note that the length of a partition $\mu$ is the number of rows in $\mathrm{dg}(\mu)$, while 
the first part of $\mu$ is the number of columns of $\mathrm{dg}(\mu)$. Hence, conjugation gives a bijection 
between the partitions counted by $p(n,k)$ and the partitions counted by $p'(n,k)$, so that 
$p(n,k) = p'(n,k)$. Similarly, consider the proof of the recursion 


\[ p(n,k) = p(n-1,k-1) + p(n-k,k). \]


Suppose $\mu$ is a partition of $n$ into $k$ parts. If $\mathrm{dg}(\mu)$ has one box in the lowest row, we remove 
this box to get a typical partition counted by $p(n-1,k-1)$. If $\mathrm{dg}(\mu)$ has more than one 
box in the lowest row, we remove the entire first column of the diagram to get a typical 
partition counted by $p(n-k,k)$. See Figure 2.17. 


FIGURE 2.16 
Conjugate of a partition. 


FIGURE 2.17 
Pictorial proof of the recursion for p(n, k). 

Our next result counts integer partitions whose diagrams fit in a box with b rows and a 
columns. 


2.47. Theorem: Enumeration of Partitions in a Box. The number of integer partitions 
$\mu$ such that $\mathrm{dg}(\mu) \subseteq \{1,2,\ldots,b\} \times \{1,2,\ldots,a\}$ is 


\[ \binom{a+b}{a,b} = \frac{(a+b)!}{a!\,b!}. \]


Proof. We define a bijection between the set of integer partitions in the theorem statement 
and the set of all lattice paths from the origin to $(a,b)$. We draw our partition diagrams 
in the box with corners $(0,0)$, $(a,0)$, $(0,b)$, and $(a,b)$, as shown in Figure 2.18. Given a 
partition $\mu$ whose diagram fits in this box, the southeast boundary of $\mathrm{dg}(\mu)$ is a lattice path 
from the origin to $(a,b)$. We call this lattice path the frontier of $\mu$ (which depends on $a$ and 
$b$ as well as $\mu$). For example, if $a = 16$, $b = 10$, and $\mu = (10,10,5,4,4,4,2)$, we see from 
Figure 2.18 that the frontier of $\mu$ is 


NNNEENEENNNENEEEEENNEEEEEE. 


Conversely, given any lattice path ending at $(a,b)$, the set of lattice squares northwest of 
this path in the box uniquely determines the diagram of an integer partition. We already 
know that the number of lattice paths from the origin to $(a,b)$ is $\binom{a+b}{a,b}$, so the theorem 
follows. □ 


FIGURE 2.18 
Counting partitions that fit in an a x b box. 
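The frontier bijection can be made concrete as follows. This is a sketch under the assumption that the diagram is drawn with its first row at the top of the box, so the frontier is read from the bottom row upward; the function names `frontier`, `partition_from_frontier`, and `lattice_words` are hypothetical.

```python
from itertools import combinations

def frontier(mu, a, b):
    """Frontier word (N/E steps from (0,0) to (a,b)) of a partition fitting in the box."""
    if len(mu) > b or (mu and mu[0] > a):
        raise ValueError("partition does not fit in the box")
    rows = list(mu) + [0] * (b - len(mu))    # pad with empty rows at the bottom
    word, x = [], 0
    for length in reversed(rows):            # bottom row first
        word.append("E" * (length - x))      # move right to the end of this row
        word.append("N")                     # then step up past the row
        x = length
    word.append("E" * (a - x))               # finish along the top edge
    return "".join(word)

def partition_from_frontier(word):
    """Inverse map: each N step records a row whose length is the current x-coordinate."""
    x, rows = 0, []
    for step in word:
        if step == "E":
            x += 1
        else:
            rows.append(x)
    return tuple(r for r in reversed(rows) if r > 0)

def lattice_words(a, b):
    """All words with a letters E and b letters N."""
    n = a + b
    return ["".join("N" if i in pos else "E" for i in range(n))
            for pos in map(set, combinations(range(n), b))]

print(frontier((10, 10, 5, 4, 4, 4, 2), 16, 10))
```

Running both maps over `lattice_words(a, b)` for small a and b shows that distinct paths give distinct partitions, matching the count in the theorem.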


2.48. Remark: Euler’s Partition Recursion. Our recursion for $p(n,k)$ gives a quick 
method for computing the quantities $p(n,k)$ and $p(n) = \sum_{k=0}^{n} p(n,k)$. One may ask whether 
the numbers $p(n)$ satisfy any recursion. In fact, Euler’s study of the infinite product 
$\prod_{i=1}^{\infty} (1 - x^i)$ leads to the following recursion for $p(n)$: 


\[ \begin{aligned}
p(n) &= \sum_{m=1}^{\infty} (-1)^{m-1}\left[p(n - m(3m-1)/2) + p(n - m(3m+1)/2)\right] \\
&= p(n-1) + p(n-2) - p(n-5) - p(n-7) + p(n-12) + p(n-15) \\
&\quad - p(n-22) - p(n-26) + p(n-35) + p(n-40) - p(n-51) - p(n-57) + \cdots.
\end{aligned} \]


The initial conditions are $p(0) = 1$ and $p(j) = 0$ for all $j < 0$. It follows that, for each fixed 
$n$, the recursive expression for $p(n)$ is really a finite sum, since the terms become zero once 
the argument to $p$ becomes negative. For example, Figure 2.19 illustrates the calculation of 
$p(n)$ from Euler’s recursion for $1 \le n \le 12$. We shall prove Euler’s recursion later (see 8.27 
and 8.87). 
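Euler's recursion is easy to run by machine. The sketch below (with the hypothetical function name `partition_numbers`) stops the sum over m as soon as the first argument n - m(3m-1)/2 becomes negative, which is what makes the infinite sum finite in practice.

```python
def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)] computed from Euler's recursion."""
    p = [1] + [0] * n_max                      # p(0) = 1
    for n in range(1, n_max + 1):
        total, m = 0, 1
        while m * (3 * m - 1) // 2 <= n:
            sign = 1 if m % 2 == 1 else -1     # (-1)^(m-1)
            total += sign * p[n - m * (3 * m - 1) // 2]
            second = m * (3 * m + 1) // 2
            if second <= n:                    # p(j) = 0 for negative j
                total += sign * p[n - second]
            m += 1
        p[n] = total
    return p

print(partition_numbers(12))
```

Only about the square root of n terms are nonzero for each n, so this is considerably faster than summing p(n, k) over all k.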


p(1)  = p(0) = 1 
p(2)  = p(1) + p(0) = 1 + 1 = 2 
p(3)  = p(2) + p(1) = 2 + 1 = 3 
p(4)  = p(3) + p(2) = 3 + 2 = 5 
p(5)  = p(4) + p(3) - p(0) = 5 + 3 - 1 = 7 
p(6)  = p(5) + p(4) - p(1) = 7 + 5 - 1 = 11 
p(7)  = p(6) + p(5) - p(2) - p(0) = 11 + 7 - 2 - 1 = 15 
p(8)  = p(7) + p(6) - p(3) - p(1) = 15 + 11 - 3 - 1 = 22 
p(9)  = p(8) + p(7) - p(4) - p(2) = 22 + 15 - 5 - 2 = 30 
p(10) = p(9) + p(8) - p(5) - p(3) = 30 + 22 - 7 - 3 = 42 
p(11) = p(10) + p(9) - p(6) - p(4) = 42 + 30 - 11 - 5 = 56 
p(12) = p(11) + p(10) - p(7) - p(5) + p(0) = 56 + 42 - 15 - 7 + 1 = 77. 

FIGURE 2.19 
Calculating p(n) using Euler’s recursion. 


2.49. Remark: Hardy-Rademacher-Ramanujan Formula for p(n). There exists an 
explicit, non-recursive formula for the number of partitions of $n$. The formula is 


\[ p(n) = \frac{1}{\pi\sqrt{2}} \sum_{k=1}^{\infty} \sqrt{k}\, A_k(n)\, \frac{d}{dx}\left[\frac{\sinh\left(\frac{\pi}{k}\sqrt{\frac{2}{3}\left(x - \frac{1}{24}\right)}\right)}{\sqrt{x - \frac{1}{24}}}\right]_{x=n}, \]


where 

\[ A_k(n) = \sum_{\substack{1 \le h \le k: \\ \gcd(h,k)=1}} \omega_{h,k}\, e^{-2\pi i n h/k} \]

and $\omega_{h,k}$ is a certain complex $24k$th root of unity. By estimating $p(n)$ by the first term of 
this series, one can deduce the following asymptotic formula for $p(n)$: 


\[ p(n) \sim \frac{1}{4n\sqrt{3}} \exp\left[\pi\sqrt{2n/3}\right]. \]


We will not prove these results. For more details, consult Andrews [5, Chapter 5]. 


2.9 Set Partitions 


2.50. Definition: Set Partitions. Let X be a set. A set partition of X is a collection P 
of nonempty, pairwise disjoint subsets of X whose union is X. Each element of P is called 
a block of the partition. The cardinality of P (which may be infinite) is called the number 
of blocks of the partition. 

For example, if $X = \{1,2,3,4,5,6,7,8\}$, then 


\[ P = \{\{3,5,8\}, \{1,7\}, \{2\}, \{4,6\}\} \]


is a set partition of $X$ with four blocks. Note that the ordering of the blocks in this list, and 
the ordering of the elements within each block, is irrelevant when deciding the equality of 
two set partitions. For instance, 


\[ \{\{6,4\}, \{1,7\}, \{2\}, \{5,8,3\}\} \]


is the same set partition as the partition $P$ mentioned above. It is convenient to visualize a 
set partition $P$ by drawing the elements of $X$ in a circle, and then drawing smaller circles 
enclosing the elements of each block of $P$. See Figure 2.20. 


FIGURE 2.20 
A picture of the set partition {{3,5,8}, {1,7}, {2}, {4,6}}. 


2.51. Definition: Stirling Numbers and Bell Numbers. Let S(n,k) be the number 
of set partitions of {1,2,...,n} into exactly k blocks. S(n,k) is called a Stirling number 
of the second kind. Let B(n) be the total number of set partitions of {1,2,...,n}. B(n) is 
called a Bell number. One can check that S(n,k) is the number of partitions of any given 
n-element set into k blocks; similarly for B(n). 


Stirling numbers and Bell numbers are not given by closed formulas involving factorials, 
binomial coefficients, etc. (although there are summation formulas and generating functions 
for these quantities). However, the Stirling numbers satisfy a recursion that can be used to 
compute S(n,k) and B(n) quite rapidly. 


2.52. Theorem: Recursion for Stirling Numbers of the Second Kind. For all $n > 0$ 
and $k > 0$, 

\[ S(n,k) = S(n-1,k-1) + k\,S(n-1,k). \]


The initial conditions are $S(0,0) = 1$, $S(n,0) = 0$ for $n > 0$, and $S(0,k) = 0$ for $k > 0$. 
Furthermore, $B(0) = 1$ and $B(n) = \sum_{k=1}^{n} S(n,k)$ for $n > 0$. 


Proof. Fix $n,k > 0$. Let $A$ be the set of set partitions of $\{1,2,\ldots,n\}$ into exactly $k$ blocks. 
Let $A' = \{P \in A : \{n\} \in P\}$ and $A'' = \{P \in A : \{n\} \notin P\}$. $A$ is the disjoint union of 
$A'$ and $A''$. $A'$ consists of those set partitions such that $n$ is in a block by itself, while $A''$ 
consists of those set partitions such that $n$ is in a block with some other elements. To build 
a typical partition $P \in A'$, we first choose an arbitrary set partition $P_0$ of $\{1,2,\ldots,n-1\}$ 
into $k-1$ blocks in any of $S(n-1,k-1)$ ways. Then we add $\{n\}$ to $P_0$ to get $P$. To build 
a typical partition $P \in A''$, we first choose an arbitrary set partition $P_1$ of $\{1,2,\ldots,n-1\}$ 
into $k$ blocks in any of $S(n-1,k)$ ways. Then we choose one of these $k$ blocks and add $n$ 
as a new member of that block. By the sum and product rules, 


\[ S(n,k) = |A| = |A'| + |A''| = S(n-1,k-1) + k\,S(n-1,k). \]


The initial conditions are immediate from the definitions (note that $P = \emptyset$ is the unique set 
partition of $X = \emptyset$). The formula for $B(n)$ follows from the sum rule. □ 


       k=0  k=1  k=2  k=3   k=4   k=5  k=6  k=7  k=8   B(n) 
n=0:     1    0    0    0     0     0    0    0    0      1 
n=1:     0    1    0    0     0     0    0    0    0      1 
n=2:     0    1    1    0     0     0    0    0    0      2 
n=3:     0    1    3    1     0     0    0    0    0      5 
n=4:     0    1    7    6     1     0    0    0    0     15 
n=5:     0    1   15   25    10     1    0    0    0     52 
n=6:     0    1   31   90    65    15    1    0    0    203 
n=7:     0    1   63  301   350   140   21    1    0    877 
n=8:     0    1  127  966  1701  1050  266   28    1   4140 

FIGURE 2.21 
Calculating S(n,k) and B(n) recursively. 


Figure 2.21 computes S(n,k) and B(n) for n ≤ 8 using the recursion from the last 
theorem. Note that each entry S(n,k) in row n and column k is computed by taking the 
number immediately northwest and adding k times the number immediately above the given 
entry. The numbers B(n) are found by adding the numbers in each row. 
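The table-filling rule just described ("northwest entry plus k times the entry above") can be sketched as follows; the function name `stirling_table` and the list-of-lists layout are assumptions of this example.

```python
def stirling_table(n_max):
    """Return S with S[n][k] = Stirling number of the second kind, 0 <= n, k <= n_max."""
    S = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    S[0][0] = 1                                   # initial condition S(0,0) = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            # Northwest neighbor plus k times the neighbor directly above.
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
    return S

S = stirling_table(8)
bell = [sum(row) for row in S]                    # B(n) is the sum of row n
print(S[8])
print(bell)
```

Each entry costs one multiplication and one addition, so the whole table up to row n takes on the order of n^2 arithmetic operations.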

The Bell numbers also satisfy a nice recursion. 


2.53. Theorem: Recursion for Bell Numbers. For all $n > 0$, 


\[ B(n) = \sum_{k=0}^{n-1} \binom{n-1}{k} B(n-1-k). \]


The initial condition is $B(0) = 1$. 


Proof. For $n > 0$, we construct a typical set partition $P$ counted by $B(n)$ as follows. Let 
$k$ be the number of elements in the block of $P$ containing $n$, not including $n$ itself; thus, 
$0 \le k \le n-1$. To build $P$, first choose $k$ elements from $\{1,2,\ldots,n-1\}$ that belong to 
the same block as $n$ in any of $\binom{n-1}{k}$ ways. Then, choose an arbitrary set partition of the 
$n-1-k$ elements that do not belong to the same block as $n$; this choice can be made in 
any of $B(n-1-k)$ ways. The recursion now follows from the sum and product rules. □ 
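This recursion computes the Bell numbers without going through the Stirling numbers. A minimal sketch (the function name `bell_numbers` is an assumption of this example):

```python
from math import comb

def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)] via B(n) = sum_{k=0}^{n-1} C(n-1, k) B(n-1-k)."""
    B = [1]                                        # B(0) = 1
    for n in range(1, n_max + 1):
        # k = number of other elements sharing a block with n.
        B.append(sum(comb(n - 1, k) * B[n - 1 - k] for k in range(n)))
    return B

print(bell_numbers(8))
```

The list produced this way should agree with the row sums of the table of Stirling numbers of the second kind.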


For example, assuming that $B(m)$ is already known for $m < 8$ (cf. Figure 2.21), we 
calculate 


\[ \begin{aligned}
B(8) &= \binom{7}{0}B(7) + \binom{7}{1}B(6) + \binom{7}{2}B(5) + \cdots + \binom{7}{7}B(0) \\
&= 1 \cdot 877 + 7 \cdot 203 + 21 \cdot 52 + 35 \cdot 15 + 35 \cdot 5 + 21 \cdot 2 + 7 \cdot 1 + 1 \cdot 1 \\
&= 4140.
\end{aligned} \]

We close this section by reviewing the connection between set partitions and equivalence 
relations. 


2.54. Definition: Types of Relations. Let $X$ be any set. A relation on $X$ is any subset 
of $X \times X$. If $R$ is a relation on $X$ and $x,y \in X$, we often write $xRy$ as an abbreviation 
for $(x,y) \in R$. We read this symbol as “$x$ is related to $y$ under $R$.” A relation $R$ on $X$ is 
reflexive on $X$ iff $xRx$ for all $x \in X$. $R$ is irreflexive on $X$ iff $xRx$ is false for all $x \in X$. $R$ 
is symmetric iff for all $x,y \in X$, $xRy$ implies $yRx$. $R$ is antisymmetric iff for all $x,y \in X$, 
$xRy$ and $yRx$ imply $x = y$. $R$ is transitive iff for all $x,y,z \in X$, $xRy$ and $yRz$ imply $xRz$. 
$R$ is an equivalence relation on $X$ iff $R$ is symmetric, transitive, and reflexive on $X$. If $R$ 
is an equivalence relation and $x_0 \in X$, the equivalence class of $x_0$ relative to $R$ is the set 
$[x_0]_R = \{y \in X : yRx_0\}$. 


2.55. Theorem: Set Partitions vs. Equivalence Relations. Suppose X is a fixed set. 
Let A be the set of all set partitions of X, and let B be the set of all equivalence relations on 
X. There are canonical bijections ¢: A — B and ¢' : B > A. If P € A, then the number of 
blocks of P equals the number of equivalence classes of ¢(P). Hence, S(n, k) is the number 
of equivalence relations on an n-element set having k equivalence classes, and B(n) is the 
number of equivalence relations on an n-element set. 


Proof. We sketch the proof, leaving certain details as exercises. Given a set partition $P \in A$,
define

\[ \phi(P) = \{(x,y) \in X \times X : \exists S \in P,\ x \in S \text{ and } y \in S\}. \]

In other words, $x\,\phi(P)\,y$ iff $x$ and $y$ belong to the same block of $P$. The reader should
check that $\phi(P)$ is indeed an equivalence relation on $X$, i.e., that $\phi(P) \in B$. Thus, $\phi$ is a
well-defined function from $A$ into $B$.

Given an equivalence relation $R \in B$, define

\[ \phi'(R) = \{[x]_R : x \in X\}. \]

In other words, the blocks of $\phi'(R)$ are precisely the equivalence classes of $R$. The reader
should check that $\phi'(R)$ is indeed a set partition of $X$, i.e., that $\phi'(R) \in A$. Thus, $\phi'$ is a
well-defined function from $B$ into $A$.

To complete the proof, the reader should check that $\phi$ and $\phi'$ are two-sided inverses of
one another. In other words, prove that for all $P \in A$, $\phi'(\phi(P)) = P$; and for all $R \in B$,
prove that $\phi(\phi'(R)) = R$. It follows that $\phi$ and $\phi'$ are bijections. □




2.10 Surjections 


Recall that a function $f : X \to Y$ is a surjection iff for every $y \in Y$, there exists $x \in X$
with $f(x) = y$.


2.56. Definition: Surj(n,k). Let $\mathrm{Surj}(n,k)$ denote the number of surjections from an
$n$-element set onto a $k$-element set.


2.57. Theorem: Recursion for Surjections. For $n \ge k > 0$,

\[ \mathrm{Surj}(n,k) = k\,\mathrm{Surj}(n-1,k-1) + k\,\mathrm{Surj}(n-1,k). \]


The initial conditions are $\mathrm{Surj}(n,k) = 0$ for $n < k$, $\mathrm{Surj}(0,0) = 1$, and $\mathrm{Surj}(n,0) = 0$ for
$n > 0$.

Proof. Fix $n \ge k > 0$. Let us build a typical surjection $f : \{1,2,\ldots,n\} \to \{1,2,\ldots,k\}$ by
considering two cases. Case 1: $f(i) \ne f(n)$ for all $i < n$. In this case, we first choose $f(n)$ in
$k$ ways, and then we choose a surjection from $\{1,2,\ldots,n-1\}$ onto $\{1,2,\ldots,k\} \setminus \{f(n)\}$
in $\mathrm{Surj}(n-1,k-1)$ ways. The total number of possibilities is $k\,\mathrm{Surj}(n-1,k-1)$.

Case 2: $f(n) = f(i)$ for some $i < n$. In this case, note that the restriction of $f$
to $\{1,2,\ldots,n-1\}$ is still surjective. Thus we can build $f$ by first choosing a surjection
from $\{1,2,\ldots,n-1\}$ onto $\{1,2,\ldots,k\}$ in $\mathrm{Surj}(n-1,k)$ ways, and then choosing
$f(n) \in \{1,2,\ldots,k\}$ in $k$ ways. The total number of possibilities is $k\,\mathrm{Surj}(n-1,k)$. The
recursion now follows from the sum rule.

The initial conditions are immediate, once we note that the function with graph $\emptyset$ is the
unique surjection from $\emptyset$ onto $\emptyset$. □
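As a sanity check on the recursion, one can compare it with a direct enumeration of all functions for small $n$ and $k$. The sketch below is an illustration only; the names `surj` and `surj_brute` are hypothetical, and the codomain is modeled as `range(k)`.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def surj(n, k):
    """Surj(n, k) computed from the recursion and initial conditions of 2.57."""
    if n < k:
        return 0
    if k == 0:
        return 1 if n == 0 else 0
    return k * surj(n - 1, k - 1) + k * surj(n - 1, k)

def surj_brute(n, k):
    """Count surjections {0..n-1} -> {0..k-1} by listing every function."""
    target = set(range(k))
    return sum(1 for f in product(range(k), repeat=n) if set(f) == target)

# Compare the two computations on a small range of arguments.
recursion_matches = all(
    surj(n, k) == surj_brute(n, k) for n in range(6) for k in range(6)
)
```

The brute-force count grows like $k^n$, so it is only practical for small arguments; the recursion has no such limitation.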


Surjections are closely related to Stirling numbers of the second kind. Indeed, we have
the following relation between $\mathrm{Surj}(n,k)$ and $S(n,k)$.


2.58. Theorem. For all $n, k \ge 0$, $\mathrm{Surj}(n,k) = k!\,S(n,k)$.


Proof. We give two proofs. First Proof: We argue by induction on $n$. The result holds when
$n = 0$ and $k$ is arbitrary, since $\mathrm{Surj}(0,k) = \chi(k = 0) = k!\,S(0,k)$. Assume that $n > 0$ and
that $\mathrm{Surj}(m,k) = k!\,S(m,k)$ for all $k$ and all $m < n$. Using the recursions for $\mathrm{Surj}(n,k)$ and
$S(n,k)$, we compute

\begin{align*}
\mathrm{Surj}(n,k) &= k\,\mathrm{Surj}(n-1,k-1) + k\,\mathrm{Surj}(n-1,k) \\
 &= k(k-1)!\,S(n-1,k-1) + k\bigl(k!\,S(n-1,k)\bigr) \\
 &= k!\,[S(n-1,k-1) + k\,S(n-1,k)] \\
 &= k!\,S(n,k).
\end{align*}


Second Proof: We prove the formula by a direct counting argument. To construct a
surjection $f : \{1,2,\ldots,n\} \to \{1,2,\ldots,k\}$, first choose a set partition $P$ of $\{1,2,\ldots,n\}$ into
$k$ blocks in any of $S(n,k)$ ways. Choose one of these blocks (in $k$ ways), and let $f$ map
everything in this block to 1. Then choose a different block (in $k-1$ ways), and let $f$ map
everything in this block to 2. Continue similarly; at the last stage, there is 1 block left, and
we let $f$ map everything in this block to $k$. By the product rule,

\[ \mathrm{Surj}(n,k) = S(n,k) \cdot k \cdot (k-1) \cdots 1 = k!\,S(n,k). \qquad \Box \]


2.59. Example. To illustrate the second proof, suppose n = 8 and k = 4. In the first 
step, let us choose the partition P = {{1,4,7}, {2}, {3, 8}, {5, 6}}. In the next four steps, 
we choose a permutation of the four blocks of P, say 


{2}, {5, 6}, {3, 8}, {1, 4, 7}. 


Now we define the associated surjection f by setting 


\[ f(2) = 1, \quad f(5) = f(6) = 2, \quad f(3) = f(8) = 3, \quad f(1) = f(4) = f(7) = 4. \]
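The construction in this example is easy to mechanize. The following Python sketch (the helper name `surjection_from` is an assumption made for illustration) builds the surjection from an ordered list of blocks and confirms that the $4!$ orderings of the blocks of $P$ produce $4!$ different surjections.

```python
from itertools import permutations

def surjection_from(blocks_in_order):
    """Send every element of the j-th chosen block to the value j (1-based)."""
    return {x: j for j, block in enumerate(blocks_in_order, start=1) for x in block}

# The ordering of blocks chosen in Example 2.59.
f = surjection_from([{2}, {5, 6}, {3, 8}, {1, 4, 7}])

# Every ordering of the blocks of P gives a distinct surjection onto {1,2,3,4}.
P = [{1, 4, 7}, {2}, {3, 8}, {5, 6}]
distinct_surjections = {
    tuple(sorted(surjection_from(order).items())) for order in permutations(P)
}
```

Summing the count of orderings over all $S(8,4)$ partitions is exactly the product-rule step in the second proof of 2.58.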


2.11 Stirling Numbers and Rook Theory 


Recall that the Stirling numbers of the second kind (denoted $S(n,k)$) count the number of
set partitions of an $n$-element set into $k$ blocks. This section gives another combinatorial
interpretation of these Stirling numbers. We show that $S(n,k)$ counts certain placements
of rooks on a triangular chessboard. A slight variation of this setup leads us to introduce
the (signless) Stirling numbers of the first kind. The relationship between the two kinds of
Stirling numbers will be illuminated in the following section.


2.60. Definition: Ferrers Boards and Rooks. A Ferrers board is the diagram of an
integer partition, viewed as a collection of unit squares as in §2.8. A rook is a chess piece
that can occupy any of the squares in a Ferrers board. In chess, a rook can move any
number of squares horizontally or vertically from its current position in a single move. A
rook located in row $i$ and column $j$ of a Ferrers board attacks all squares in row $i$ and all
squares in column $j$.


For example, in the Ferrers board shown below, the rook R attacks all squares on the 
board marked with a dot (and its own square). 


For each $n > 0$, let $\Delta_n$ denote the diagram of the partition $(n-1, n-2, \ldots, 3, 2, 1)$. $\Delta_n$ is
a triangular Ferrers board with $n(n-1)/2$ total squares. For example, $\Delta_5$ is the diagram of
the partition $(4, 3, 2, 1)$.


2.61. Definition: Non-attacking Rook Placements. A placement of k rooks on a given 
Ferrers board is a subset of k squares in the Ferrers board. These k squares represent the 
locations of k identical rooks on the board. A placement of rooks in a Ferrers board is 
called non-attacking iff no rook occupies a square attacked by another rook. Equivalently, 
all rooks in the placement occupy distinct rows and distinct columns of the board. 


2.62. Example. The following diagram illustrates a non-attacking placement of 3 rooks 
on the Ferrers board corresponding to the partition (7,4, 4, 3, 2). 


2.63. Theorem: Rook-Theoretic Interpretation of Stirling Numbers of the Second
Kind. For $n > 0$ and $0 \le k \le n$, let $S'(n,k)$ denote the number of non-attacking
placements of $n-k$ rooks on the Ferrers board $\Delta_n$. If $n > 1$ and $0 < k < n$, then

\[ S'(n,k) = S'(n-1,k-1) + k\,S'(n-1,k). \]

The initial conditions are $S'(n,0) = 0$ and $S'(n,n) = 1$ for all $n > 0$. Therefore, $S'(n,k) =
S(n,k)$, a Stirling number of the second kind.


Proof. Fix $n > 1$ with $0 < k < n$. Let $A, B, C$ denote the sets of placements counted by
$S'(n,k)$, $S'(n-1,k-1)$, and $S'(n-1,k)$, respectively. Let $A_0$ consist of all rook placements
in $A$ with no rook in the top row, and let $A_1$ consist of all rook placements in $A$ with one
rook in the top row. $A$ is the disjoint union of $A_0$ and $A_1$. Deleting the top row of the Ferrers
board $\Delta_n$ produces the smaller Ferrers board $\Delta_{n-1}$. It follows that deleting the (empty) top
row of a rook placement in $A_0$ gives a bijection between $A_0$ and $B$ (note that a placement
in $B$ involves $(n-1) - (k-1) = n-k$ rooks). On the other hand, we can build a typical
rook placement in $A_1$ as follows. First, choose a placement of $n-k-1$ non-attacking rooks
from the set $C$, and use this rook placement to fill the bottom $n-1$ rows of $\Delta_n$. These
rooks occupy $n-k-1$ distinct columns. This leaves $(n-1) - (n-k-1) = k$ columns in the
top row in which we are allowed to place the final rook. By the product rule, $|A_1| = |C|k$.
We conclude that

\[ S'(n,k) = |A| = |A_0| + |A_1| = |B| + k|C| = S'(n-1,k-1) + k\,S'(n-1,k). \]

We cannot place $n$ non-attacking rooks on the Ferrers board $\Delta_n$ (which has only $n-1$
columns), and hence $S'(n,0) = 0$. On the other hand, for any $n > 0$ there is a unique
placement of zero rooks on $\Delta_n$. This placement is non-attacking (vacuously), and hence
$S'(n,n) = 1$. Counting set partitions, we see that $S(n,0) = 0$ and $S(n,n) = 1$ for all
$n > 0$. Since $S'(n,k)$ and $S(n,k)$ satisfy the same recursion and initial conditions, a routine
induction argument (cf. 2.34) shows that $S'(n,k) = S(n,k)$ for all $n$ and $k$. □
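The equality $S'(n,k) = S(n,k)$ can also be observed numerically for small boards. The sketch below is illustrative only: the names `delta`, `rook_count`, and `stirling2` are hypothetical, and the board $\Delta_n$ is modeled as a list of (row, column) pairs in which row $r$ has $n - r$ squares.

```python
from functools import lru_cache
from itertools import combinations

def delta(n):
    """Squares (row, col) of Delta_n: row r (1 <= r <= n-1) has n - r squares."""
    return [(r, c) for r in range(1, n) for c in range(1, n - r + 1)]

def rook_count(n, k):
    """S'(n, k): placements of n - k rooks on Delta_n in distinct rows and columns."""
    count = 0
    for placement in combinations(delta(n), n - k):
        rows = {r for r, _ in placement}
        cols = {c for _, c in placement}
        if len(rows) == len(cols) == n - k:
            count += 1
    return count

@lru_cache(maxsize=None)
def stirling2(n, k):
    """S(n, k) from the recursion S(n,k) = S(n-1,k-1) + k S(n-1,k)."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    if k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

counts_agree = all(
    rook_count(n, k) == stirling2(n, k) for n in range(1, 7) for k in range(n + 1)
)
```

The enumeration tries every subset of squares, so it is meant only as a check on small cases.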


2.64. Remark. We have given combinatorial proofs that the numbers $S'(n,k)$ and $S(n,k)$
satisfy the same recursion. We can link together these proofs to get a recursively defined
bijection between rook placements and set partitions, using the ideas in 2.39. We can also
directly define a bijection between rook placements and set partitions. We illustrate such
a bijection via an example. Figure 2.22 displays a rook placement counted by $S'(8,3)$. We
write the numbers 1 through $n$ below the last square in each column of the diagram, as shown
in the figure. We view these numbers as labeling both the rows and columns of the diagram;
note that the column labels increase from left to right, while row labels decrease from top
to bottom. The bijection between non-attacking rook placements $\pi$ and set partitions $P$
acts as follows. For all $j < i \le n$, there is a rook in row $i$ and column $j$ of $\pi$ iff $i$ and $j$ are
consecutive elements in the same block of $P$ (when the elements of the block are written
in increasing order). For example, the rook placement $\pi$ in Figure 2.22 maps to the set
partition

\[ P = \{\{1,3,4,5,7\}, \{2,6\}, \{8\}\}. \]

The set partition $\{\{2\}, \{1,5,8\}, \{4,6,7\}, \{3\}\}$ maps to the rook placement shown in
Figure 2.23.

One may check that a non-attacking placement of $n-k$ rooks on $\Delta_n$ corresponds to a
set partition of $\{1,2,\ldots,n\}$ with exactly $k$ blocks; furthermore, the rook placement associated to a
given set partition is automatically non-attacking.
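The direct bijection described in this remark can be sketched in a few lines of Python. This is an illustration under stated assumptions: a rook is recorded as a pair (row label, column label), and the function names are hypothetical.

```python
def rooks_from_partition(P):
    """Place a rook at (row i, column j), j < i, iff j and i are consecutive in a block."""
    rooks = set()
    for block in P:
        b = sorted(block)
        rooks.update((i, j) for j, i in zip(b, b[1:]))
    return rooks

def partition_from_rooks(rooks, n):
    """Inverse map: follow rooks from each block's smallest element."""
    successor = {j: i for i, j in rooks}          # column label -> row label
    starts = set(range(1, n + 1)) - set(successor.values())
    blocks = set()
    for s in starts:
        block = [s]
        while block[-1] in successor:
            block.append(successor[block[-1]])
        blocks.add(frozenset(block))
    return blocks

P = {frozenset({1, 3, 4, 5, 7}), frozenset({2, 6}), frozenset({8})}
rooks = rooks_from_partition(P)
recovered = partition_from_rooks(rooks, 8)
```

For this partition of $\{1,\ldots,8\}$ into 3 blocks, the placement has $8 - 3 = 5$ rooks, and no two of them share a row label or a column label.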


2.65. Definition: Wrooks and Stirling Numbers of the First Kind. A wrook (weak
rook) is a new chess piece that attacks only the squares in its row. For all $n > 0$ and
$0 \le k \le n$, let $s'(n,k)$ denote the number of placements of $n-k$ non-attacking wrooks
on the Ferrers board $\Delta_n$. The numbers $s'(n,k)$ are called signless Stirling numbers of the
first kind. The numbers $s(n,k) = (-1)^{n-k}s'(n,k)$ are called (signed) Stirling numbers of
the first kind. Another combinatorial definition of the Stirling numbers of the first kind will
be given in §3.6. By convention, we set $s(0,0) = 1 = s'(0,0)$.


2.66. Theorem: Recursion for Signless Stirling Numbers of the First Kind. If
$n > 1$ and $0 < k < n$, then

\[ s'(n,k) = s'(n-1,k-1) + (n-1)\,s'(n-1,k). \]

The initial conditions are $s'(n,0) = \chi(n = 0)$ and $s'(n,n) = 1$.


FIGURE 2.22
A rook placement counted by $S'(n,k)$, where $n = 8$ and $k = 3$.


FIGURE 2.23
The rook placement associated to $\{\{2\}, \{1,5,8\}, \{4,6,7\}, \{3\}\}$.


        k=0    k=1     k=2     k=3     k=4    k=5    k=6    k=7
n=0:     1      0       0       0       0      0      0      0
n=1:     0      1       0       0       0      0      0      0
n=2:     0     -1       1       0       0      0      0      0
n=3:     0      2      -3       1       0      0      0      0
n=4:     0     -6      11      -6       1      0      0      0
n=5:     0     24     -50      35     -10      1      0      0
n=6:     0   -120     274    -225      85    -15      1      0
n=7:     0    720   -1764    1624    -735    175    -21      1

FIGURE 2.24
Signed Stirling numbers of the first kind.


Proof. Fix $n > 1$ with $0 < k < n$. Let $A, B, C$ denote the sets of placements counted by
$s'(n,k)$, $s'(n-1,k-1)$, and $s'(n-1,k)$, respectively. Write $A$ as the disjoint union of $A_0$
and $A_1$, where $A_i$ consists of all elements of $A$ with $i$ wrooks in the top row. As above,
deletion of the empty top row gives a bijection from $A_0$ to $B$. On the other hand, we can
build a typical wrook placement in $A_1$ as follows. First, choose the position of the wrook in
the top row of $\Delta_n$ in $n-1$ ways. Second, choose any placement of $n-k-1$ non-attacking
wrooks from the set $C$, and use this wrook placement to fill the bottom $n-1$ rows of $\Delta_n$.
These wrooks do not attack the wrook in the first row. By the sum and product rules,

\[ s'(n,k) = |A| = |A_0| + |A_1| = |B| + (n-1)|C| = s'(n-1,k-1) + (n-1)\,s'(n-1,k). \qquad \Box \]


We can use the recursion and initial conditions to compute the (signed or unsigned)
Stirling numbers of the first kind. See Figure 2.24, and compare to the computation of
Stirling numbers of the second kind in Figure 2.21. There is a surprising relation between
the two arrays of numbers in these figures. Specifically, for any fixed $n > 0$, consider the
lower-triangular matrices $A = (s(i,j))_{1 \le i,j \le n}$ and $B = (S(i,j))_{1 \le i,j \le n}$. It turns out that $A$
and $B$ are inverse matrices! The reader may check this for small $n$ using Figure 2.21 and
Figure 2.24. We will prove this fact for all $n$ in §2.13.
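The check suggested here for small $n$ can be carried out by machine. The following Python sketch (function names are illustrative, not from the text) computes both kinds of Stirling numbers from their recursions and tests whether the two matrices multiply to the identity.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling1_signless(n, k):
    """s'(n, k) from s'(n,k) = s'(n-1,k-1) + (n-1) s'(n-1,k)."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    if k > n:
        return 0
    return stirling1_signless(n - 1, k - 1) + (n - 1) * stirling1_signless(n - 1, k)

def stirling1(n, k):
    """Signed value s(n, k); (-1)^(n+k) equals (-1)^(n-k)."""
    return (-1) ** (n + k) * stirling1_signless(n, k)

@lru_cache(maxsize=None)
def stirling2(n, k):
    """S(n, k) from S(n,k) = S(n-1,k-1) + k S(n-1,k)."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    if k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def are_inverse(n):
    """Check that (s(i,j)) times (S(i,j)) is the identity for 1 <= i, j <= n."""
    idx = range(1, n + 1)
    return all(
        sum(stirling1(i, k) * stirling2(k, j) for k in idx) == (1 if i == j else 0)
        for i in idx for j in idx
    )

row7 = [stirling1(7, k) for k in range(8)]   # last row of Figure 2.24
```

Because both matrices are square, a one-sided inverse is automatically two-sided, so checking the product in one order suffices.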




2.12 Linear Algebra Review 


In the next few sections, and at other places later in the book, we will need to use some 
concepts from linear algebra such as vector spaces, bases, and linear independence. This 
section quickly reviews the definitions we will need; for a thorough treatment of linear 
algebra, the reader may consult texts such as Hoffman and Kunze [69]. 


2.67. Definition: Vector Spaces. Given a field $F$, a vector space over $F$ consists of
a set $V$, an addition operation $+ : V \times V \to V$, and a scalar multiplication operation
$\cdot : F \times V \to V$, that satisfy the following axioms.

$\forall x, y \in V,\ x + y \in V$   (closure under addition)
$\forall x, y, z \in V,\ x + (y + z) = (x + y) + z$   (associativity of addition)
$\forall x, y \in V,\ x + y = y + x$   (commutativity of addition)
$\exists 0_V \in V,\ \forall x \in V,\ x + 0_V = x = 0_V + x$   (existence of additive identity)
$\forall x \in V,\ \exists {-x} \in V,\ x + (-x) = 0_V = (-x) + x$   (existence of additive inverses)
$\forall c \in F,\ \forall v \in V,\ c \cdot v \in V$   (closure under scalar multiplication)
$\forall c \in F,\ \forall v, w \in V,\ c \cdot (v + w) = (c \cdot v) + (c \cdot w)$   (left distributive law)
$\forall c, d \in F,\ \forall v \in V,\ (c + d) \cdot v = (c \cdot v) + (d \cdot v)$   (right distributive law)
$\forall c, d \in F,\ \forall v \in V,\ (cd) \cdot v = c \cdot (d \cdot v)$   (associativity of scalar multiplication)
$\forall v \in V,\ 1 \cdot v = v$   (identity property)

When discussing vector spaces, elements of V are often called vectors, while elements of F 
are called scalars. 


For example, the set $F^n = \{(x_1,\ldots,x_n) : x_i \in F\}$ is a vector space over $F$ with
operations

\[ (x_1,\ldots,x_n) + (y_1,\ldots,y_n) = (x_1 + y_1,\ldots,x_n + y_n); \qquad
   c(x_1,\ldots,x_n) = (cx_1,\ldots,cx_n) \qquad (c, x_i, y_i \in F). \]

Similarly, the set of polynomials $a_0 + a_1x + \cdots + a_nx^n$, where all coefficients $a_i$ come from
$F$, is a vector space over $F$ under the operations

\[ \sum_{i \ge 0} a_ix^i + \sum_{i \ge 0} b_ix^i = \sum_{i \ge 0} (a_i + b_i)x^i; \qquad
   c\sum_{i \ge 0} a_ix^i = \sum_{i \ge 0} (ca_i)x^i. \]

We consider two polynomials $\sum_{i \ge 0} a_ix^i$ and $\sum_{i \ge 0} b_ix^i$ to be equal iff $a_i = b_i$ for all $i$;
see §7.3 for a more formal discussion of how to define polynomials.


2.68. Definition: Spanning Sets and Linear Combinations. A subset $S$ of a vector
space $V$ over $F$ spans $V$ iff for every $v \in V$, there exists a finite list of vectors $v_1,\ldots,v_k \in
S$ and scalars $c_1,\ldots,c_k \in F$ with $v = c_1v_1 + \cdots + c_kv_k$. Any expression of the form
$c_1v_1 + \cdots + c_kv_k$ is called a linear combination of $v_1,\ldots,v_k$. A linear combination must be
a finite sum of vectors.


2.69. Definition: Linear Independence. A list $(v_1,\ldots,v_k)$ of vectors in a vector space
$V$ over $F$ is called linearly dependent iff there exist scalars $c_1,\ldots,c_k \in F$ such that
$c_1v_1 + \cdots + c_kv_k = 0_V$ and at least one $c_i$ is not zero. Otherwise, the list $(v_1,\ldots,v_k)$ is called
linearly independent. A set $S \subseteq V$ (possibly infinite) is linearly dependent iff there is a finite
list of distinct elements of $S$ that is linearly dependent; otherwise, $S$ is linearly independent.


2.70. Definition: Basis of a Vector Space. A basis of a vector space $V$ is a set $S \subseteq V$
that is linearly independent and spans $V$.


For example, for any field $F$, define $e_i \in F^n$ to be the vector with $1_F$ in position
$i$ and $0_F$ in all other positions. Then $\{e_1,\ldots,e_n\}$ is a basis for $F^n$. Similarly, one may
check that the infinite set $S = \{1, x, x^2, x^3, \ldots, x^n, \ldots\}$ is a basis for the vector space $V$ of
polynomials in $x$ with coefficients in $F$. $S$ spans $V$ since every polynomial must be a finite
linear combination of powers of $x$. The linear independence of $S$ follows from the definition
of polynomial equality: the only linear combination $c_0 1 + c_1x + c_2x^2 + \cdots$ that can equal
the zero polynomial is the one where $c_0 = c_1 = c_2 = \cdots = 0_F$. We now state without proof
some of the fundamental facts about spanning sets, linear independence, and bases.


2.71. Theorem: Linear Algebra Facts. Every vector space $V$ over a field $F$ has a
basis (possibly infinite). Any two bases of $V$ have the same cardinality, which is called the
dimension of $V$ and denoted $\dim(V)$. Given a basis of $V$, every $v \in V$ can be expressed
in exactly one way as a linear combination of the basis elements. Any linearly independent
set in $V$ can be enlarged to a basis of $V$. Any spanning set for $V$ contains a basis of $V$. A
set $S \subseteq V$ with $|S| > \dim(V)$ must be linearly dependent. A set $T \subseteq V$ with $|T| < \dim(V)$
cannot span $V$.


For example, $\dim(F^n) = n$ for all $n \ge 1$, whereas the vector space of polynomials with
coefficients in $F$ has (countably) infinite dimension.




2.13 Stirling Numbers and Polynomials 


In this section, we use our recursions for Stirling numbers (of both kinds) to give algebraic 
proofs of certain polynomial identities. We will see that these identities connect certain 
frequently used bases of the vector space of one-variable polynomials. This linear-algebraic 
interpretation of Stirling numbers will be used to show the inverse relation between the two 
triangular matrices of Stirling numbers (cf. Figures 2.21 and 2.24). The following section 
will give combinatorial proofs of the same identities using rook theory.


2.72. Theorem: Polynomial Identity for Stirling Numbers of the Second Kind.
For all $n \ge 0$ and all real $x$,

\[ x^n = \sum_{k=0}^{n} S(n,k)\,x(x-1)(x-2)\cdots(x-k+1). \tag{2.6} \]

Proof. We give an algebraic proof here; the next section gives a combinatorial proof using
rook theory. Recall that $S(0,0) = 1$ and $S(n,k) = S(n-1,k-1) + k\,S(n-1,k)$ for $n \ge 1$
and $1 \le k \le n$. We prove the stated identity by induction on $n$. If $n = 0$, the right side is
$S(0,0) = 1 = x^0$, so the identity holds. For the induction step, fix $n \ge 1$ and assume that
$x^{n-1} = \sum_{k=0}^{n-1} S(n-1,k)\,x(x-1)\cdots(x-k+1)$. Multiplying both sides by $x = (x-k) + k$,
we can write

\begin{align*}
x^n &= \sum_{k=0}^{n-1} S(n-1,k)\,x(x-1)\cdots(x-k+1)(x-k)
     + \sum_{k=0}^{n-1} S(n-1,k)\,x(x-1)\cdots(x-k+1)\,k \\
    &= \sum_{j=0}^{n-1} S(n-1,j)\,x(x-1)\cdots(x-j)
     + \sum_{k=0}^{n-1} k\,S(n-1,k)\,x(x-1)\cdots(x-k+1).
\end{align*}

In the first summation, replace $j$ by $k-1$. The calculation continues:

\begin{align*}
x^n &= \sum_{k=1}^{n} S(n-1,k-1)\,x(x-1)\cdots(x-k+1)
     + \sum_{k=0}^{n-1} k\,S(n-1,k)\,x(x-1)\cdots(x-k+1) \\
    &= \sum_{k=0}^{n} S(n-1,k-1)\,x(x-1)\cdots(x-k+1)
     + \sum_{k=0}^{n} k\,S(n-1,k)\,x(x-1)\cdots(x-k+1)
       \quad \text{(since $S(n-1,-1) = S(n-1,n) = 0$)} \\
    &= \sum_{k=0}^{n} [S(n-1,k-1) + k\,S(n-1,k)]\,x(x-1)\cdots(x-k+1) \\
    &= \sum_{k=0}^{n} S(n,k)\,x(x-1)\cdots(x-k+1)
       \quad \text{(using the recursion for $S(n,k)$).} \qquad \Box
\end{align*}


2.73. Theorem: Polynomial Identity for Signless Stirling Numbers of the First
Kind. For all $n \ge 0$ and all real $x$,

\[ x(x+1)(x+2)\cdots(x+n-1) = \sum_{k=0}^{n} s'(n,k)\,x^k. \tag{2.7} \]

Proof. Recall that $s'(0,0) = 1$ and $s'(n,k) = s'(n-1,k-1) + (n-1)\,s'(n-1,k)$ for $n \ge 1$
and $1 \le k \le n$. We use induction on $n$ again. If $n = 0$, both sides of (2.7) evaluate to 1.
For the induction step, fix $n \ge 1$ and assume that

\[ x(x+1)(x+2)\cdots(x+n-2) = \sum_{k=0}^{n-1} s'(n-1,k)\,x^k. \]

Multiply both sides of this assumption by $x + n - 1$ and compute:

\begin{align*}
x(x+1)\cdots(x+n-1) &= \sum_{k=0}^{n-1} s'(n-1,k)\,x^k(x+n-1) \\
 &= \sum_{k=0}^{n-1} s'(n-1,k)\,x^{k+1} + \sum_{k=0}^{n-1} (n-1)\,s'(n-1,k)\,x^k \\
 &= \sum_{k=1}^{n} s'(n-1,k-1)\,x^k + \sum_{k=0}^{n-1} (n-1)\,s'(n-1,k)\,x^k \\
 &= \sum_{k=0}^{n} [s'(n-1,k-1) + (n-1)\,s'(n-1,k)]\,x^k \\
 &= \sum_{k=0}^{n} s'(n,k)\,x^k.
\end{align*}

The last two steps use $s'(n-1,-1) = s'(n-1,n) = 0$ and the recursion for $s'(n,k)$. □

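2.74. Theorem: Polynomial Identity for Signed Stirling Numbers of the First
Kind. For all $n \ge 0$ and all real $x$,

\[ x(x-1)(x-2)\cdots(x-n+1) = \sum_{k=0}^{n} s(n,k)\,x^k. \tag{2.8} \]

Proof. Replacing $x$ by $-x$ in (2.7) gives

\[ (-x)(-x+1)(-x+2)\cdots(-x+n-1) = \sum_{k=0}^{n} s'(n,k)\,(-1)^k x^k. \]

Factoring $-1$ out of each of the $n$ factors, the left side equals $(-1)^n x(x-1)\cdots(x-n+1)$.
Moving the $(-1)^n$ to the right side and recalling that $s(n,k) = (-1)^{n+k}s'(n,k)$, the result
follows. □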


2.75. Theorem: Summation Formulas for Stirling Numbers of the First Kind.
For all $n \ge 1$ and $1 \le k \le n$, we have

\[ s'(n,k) = \sum_{1 \le i_1 < i_2 < \cdots < i_{n-k} \le n-1} i_1 i_2 \cdots i_{n-k}. \]

Proof. We give an algebraic proof and a combinatorial proof of this result.
Algebraic Proof. We apply the generalized distributive law to the left side of the identity

\[ (x+0)(x+1)(x+2)\cdots(x+n-1) = \sum_{k=0}^{n} s'(n,k)\,x^k. \]

According to the distributive law, the left side expands to a sum of terms obtained by
choosing either $x$ or $i$ from each factor $(x+i)$ (for $0 \le i < n$) and multiplying the chosen terms
together. To obtain a contribution to the coefficient of $x^k$, we must choose $x$ exactly $k$ times
and choose a number $i$ exactly $n-k$ times. Adding up all these contributions gives the
coefficient of $x^k$, namely $s'(n,k)$. The term $i_1 i_2 \cdots i_{n-k}$ is the contribution from the choice
sequence where we choose $i_1$ from the factor $(x+i_1)$, $i_2$ from the factor $(x+i_2)$, etc., and
choose $x$ from all factors $(x+i)$ with $i$ different from all $i_j$'s. (Any choice sequence that
selects 0 from the factor $(x+0)$ contributes zero, which is why the sum starts at $i_1 \ge 1$.)

Combinatorial Proof. Recall that $s'(n,k)$ counts the number of placements of $n-k$
non-attacking wrooks on the triangular Ferrers board $\Delta_n$. Since wrooks only attack cells in
their rows, a placement is non-attacking iff all wrooks occupy distinct rows of $\Delta_n$. Let us
classify wrook placements based on which rows contain wrooks. Suppose the $n-k$ wrooks
appear in the rows of lengths $i_1, i_2, \ldots, i_{n-k}$, where $1 \le i_1 < i_2 < \cdots < i_{n-k} \le n-1$. The
product rule shows that the number of placements of wrooks in these rows is $i_1 i_2 \cdots i_{n-k}$.
The formula in the theorem now follows from the sum rule. □
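The summation formula is straightforward to test against the recursion for $s'(n,k)$. In the sketch below, which is an illustration with hypothetical function names, the sum over $1 \le i_1 < \cdots < i_{n-k} \le n-1$ is generated with `itertools.combinations`.

```python
from functools import lru_cache
from itertools import combinations
from math import prod

def signless_by_sum(n, k):
    """Sum of the products i_1 * ... * i_(n-k) over (n-k)-subsets of {1, ..., n-1}."""
    return sum(prod(subset) for subset in combinations(range(1, n), n - k))

@lru_cache(maxsize=None)
def signless_by_recursion(n, k):
    """s'(n, k) from s'(n,k) = s'(n-1,k-1) + (n-1) s'(n-1,k)."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    if k > n:
        return 0
    return signless_by_recursion(n - 1, k - 1) + (n - 1) * signless_by_recursion(n - 1, k)

formulas_agree = all(
    signless_by_sum(n, k) == signless_by_recursion(n, k)
    for n in range(1, 9) for k in range(1, n + 1)
)
```

For $k = n$ the only subset is empty, and the empty product equals 1, matching $s'(n,n) = 1$.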


2.76. Definition: Special Bases for Polynomials. Let $V$ be the vector space of all
polynomials in one variable $x$ with real coefficients. For any integer $n > 0$, introduce the
falling factorial polynomials

\[ (x){\downarrow}_0 = 1, \qquad (x){\downarrow}_n = x(x-1)(x-2)\cdots(x-n+1). \]

Similarly, the rising factorial polynomials are defined by

\[ (x){\uparrow}_0 = 1, \qquad (x){\uparrow}_n = x(x+1)(x+2)\cdots(x+n-1). \]

The monomial basis of $V$ is the indexed set $M = \{x^n : n \ge 0\}$. The falling factorial basis
of $V$ is $F = \{(x){\downarrow}_n : n \ge 0\}$. The rising factorial basis of $V$ is $R = \{(x){\uparrow}_n : n \ge 0\}$.


It is a routine exercise to prove that any indexed collection of polynomials $\{p_n(x) : n \ge
0\}$ such that $\deg(p_n) = n$ for all $n$ is a basis of $V$. Since $x^n$, $(x){\downarrow}_n$, and $(x){\uparrow}_n$ all have degree
$n$, it follows that $M$, $F$, and $R$ really are bases of $V$. Define $M_N = \{x^n : 0 \le n \le N\}$, and
define $F_N$ and $R_N$ similarly. The three indexed collections $M_N$, $F_N$, and $R_N$ are all bases
of the vector space $V_N$ of polynomials in $x$ of degree at most $N$.

We can now recast the preceding theorems in the language of linear algebra. Recall that
if $B = (v_1,\ldots,v_n)$ and $C = (w_1,\ldots,w_n)$ are two ordered bases of a finite-dimensional
vector space $W$, the transition matrix from $B$ to $C$ is the unique $n \times n$ matrix $A = (a_{ij})$
such that

\[ v_j = \sum_{i=1}^{n} a_{ij}w_i \qquad (1 \le j \le n). \tag{2.9} \]

This matrix is so named because if $v \in W$ has coordinates $[v]_B = (s_1,\ldots,s_n)^T$ relative
to the basis $B$ (i.e., $v = \sum_i s_iv_i$), then the coordinates of $v$ relative to the basis $C$ are
given by $[v]_C = A[v]_B$. Thus, multiplication by $A$ transforms coordinates relative to $B$ into
coordinates relative to $C$. From linear algebra, we know that $A$ is invertible, and $A^{-1}$ is
none other than the transition matrix from $C$ to $B$.


2.77. Theorem: Transition Matrices between Polynomial Bases. Fix $N \ge 0$, and
write $M_N = (x^n : n \le N)$, $F_N = ((x){\downarrow}_n : n \le N)$, and $R_N = ((x){\uparrow}_n : n \le N)$, as above.

(a) The matrix $S = (S(n,k))_{0 \le n,k \le N}$ of Stirling numbers of the second kind is the transpose
of the transition matrix from the basis $M_N$ to the basis $F_N$ of the vector space $V_N$ of
polynomials of degree at most $N$.

(b) The matrix $s' = (s'(n,k))_{0 \le n,k \le N}$ of signless Stirling numbers of the first kind is the
transpose of the transition matrix from the basis $R_N$ to the basis $M_N$ of $V_N$.

(c) The matrix $s = (s(n,k))_{0 \le n,k \le N}$ of signed Stirling numbers of the first kind is the
transpose of the transition matrix from the basis $F_N$ to the basis $M_N$ of $V_N$.

(d) The $(N+1) \times (N+1)$ matrices $S$ and $s$ are inverses of one another.

Proof. The first three statements follow from equations (2.6), (2.7), (2.8), and the definition
of transition matrices (2.9). The final statement is a special case of the fact that the
transition matrix from $B$ to $C$ is the inverse of the transition matrix from $C$ to $B$. □


Part (d) of the theorem says that we have matrix identities Ss = I = sS, where I is the 
(N +1) x (N +1) identity matrix. Writing out what this means entry by entry, we obtain 
the formulas 


\[ \sum_{k} S(i,k)\,s(k,j) = \chi(i = j) = \sum_{k} s(i,k)\,S(k,j) \qquad (i, j \ge 0). \]


A combinatorial proof of the second equality will be given later (see 4.6). 


2.14 Combinatorial Proofs of Polynomial Identities 


This section gives combinatorial proofs of some of the polynomial identities in the previous 
section. We use these proofs to illustrate a common technique in combinatorics in which we 
verify that a polynomial identity holds for all $x$ by proving (combinatorially or otherwise)
that the identity holds for sufficiently many particular values of the variable $x$.

Let us introduce this technique through a specific example. 


2.78. Theorem: Stirling Numbers of the First Kind Revisited. For all nonnegative
integers $x$ and all $n \ge 0$, we have

\[ x(x+1)(x+2)\cdots(x+n-1) = \sum_{k=0}^{n} s'(n,k)\,x^k. \tag{2.10} \]


Proof. Recall that $s'(n,k)$ counts placements of $n-k$ non-attacking wrooks on the Ferrers
board $\Delta_n = \mathrm{dg}(n-1, n-2, \ldots, 1, 0)$. Fix an integer $x \ge 0$ and consider the extended Ferrers
board $\Delta_n(x) = \mathrm{dg}(x+n-1, x+n-2, \ldots, x+1, x)$.

[Diagram of an extended board with $x = 3$ not reproduced.]

Call the squares in the first $x$ columns of this board new squares. Let $A$ be the set of
placements of $n$ non-attacking wrooks on the board $\Delta_n(x)$. Note that every row of the
board must be occupied by exactly one wrook. If we place the wrooks on the board one row
at a time, working upwards from the bottom row, the product rule yields

\[ |A| = x(x+1)(x+2)\cdots(x+n-1). \]

On the other hand, we can write $A$ as the disjoint union of sets $A_k$, where $A_k$ consists of
those placements $\pi \in A$ in which exactly $k$ wrooks occupy new squares. To build a placement
$\pi \in A_k$, first place $n-k$ non-attacking wrooks in the old squares in $s'(n,k)$ ways. There
are now $k$ unused rows, each of which has $x$ new squares, and $k$ wrooks left to be placed.
Visit each of these rows (say from top to bottom), and choose one of the $x$ squares to be
occupied by a wrook. By the product rule, $|A_k| = s'(n,k)\,x^k$. The sum rule now gives

\[ |A| = \sum_{k=0}^{n} s'(n,k)\,x^k, \]

and the theorem follows. □


Comparing the combinatorial proof in 2.78 to the algebraic proof in 2.73, a subtle
difference emerges: the combinatorial proof is valid only for nonnegative integers $x$, while the
algebraic proof is valid for all real $x$ (or even for formal polynomials in any polynomial ring
$F[x]$, as defined in §7.3). The following result shows that our combinatorial proof is just
as good as the algebraic proof.


2.79. Theorem: Verifying Polynomial Identities. Suppose $F$ is a field and $p, q$ are
two polynomials in $F[x]$. Say $p$ and $q$ have degree at most $N$. If $p(c) = q(c)$ for $N+1$
elements $c \in F$, then $p = q$ in the polynomial ring $F[x]$, and hence $p(c) = q(c)$ for all
$c \in F$. In particular, if two real polynomials agree at each nonnegative integer, then the
two polynomials are identical.


Proof. Consider the polynomial $p - q \in F[x]$. If this polynomial is nonzero, its degree is
at most $N$. A well-known fact from algebra asserts that a nonzero polynomial of degree at
most $N$ has at most $N$ roots in $F$ (cf. 2.157 and 12.147). Our hypothesis says that $p - q$
has $N+1$ roots in $F$. Therefore, $p - q = 0$ in $F[x]$. Evaluating $p - q$ at any field element $c$,
it follows that $p(c) = q(c)$. □
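To see the theorem at work on identity (2.10), one can fix $n = 5$ and compare both sides at $N + 1 = 6$ values. The Python sketch below is an illustration only (the names `p`, `q`, and `signless_stirling1` are chosen here for convenience); it then spot-checks a few further rational points, where the theorem predicts agreement as well.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def signless_stirling1(n, k):
    """s'(n, k) from s'(n,k) = s'(n-1,k-1) + (n-1) s'(n-1,k)."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    if k > n:
        return 0
    return signless_stirling1(n - 1, k - 1) + (n - 1) * signless_stirling1(n - 1, k)

def p(x, n):
    """Left side of (2.10): x(x+1)...(x+n-1)."""
    result = 1
    for i in range(n):
        result *= x + i
    return result

def q(x, n):
    """Right side of (2.10): sum of s'(n,k) x^k."""
    return sum(signless_stirling1(n, k) * x**k for k in range(n + 1))

n = 5
# Both sides have degree at most 5, so agreement at 6 points forces p = q.
agree_on_sample = all(p(c, n) == q(c, n) for c in range(n + 1))
# Exact rational arithmetic avoids floating-point rounding in the spot checks.
other_points = [-7, Fraction(3, 2), Fraction(-22, 7)]
agree_elsewhere = all(p(c, n) == q(c, n) for c in other_points)
```

The spot checks add no logical force beyond the six sample points; they merely display the conclusion of 2.79 on values that the combinatorial argument does not cover.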


2.80. Example. To apply 2.79 to 2.78, fix $n$. The left side of (2.10), namely $p = x(x+
1)\cdots(x+n-1)$, is a polynomial in $x$ of degree $n$. The right side $q = \sum_{k=0}^{n} s'(n,k)\,x^k$
is also a real polynomial in $x$ of degree $n$. The wrook-theoretic proof in 2.78 showed that
$p(m) = q(m)$ for all integers $m \ge 0$. Hence, $p = q \in \mathbb{R}[x]$ and thus $p(r) = q(r)$ for all real $r$.


This type of argument involving 2.79 occurs so commonly that we will seldom spell
out the details in the future. However, one must remember to check that both sides of the
identity in question are polynomials in the variable $x$.


2.81. Theorem: Stirling Numbers of the Second Kind Revisited. For all integers
$n \ge 0$ and all real $x$,

\[ x^n = \sum_{k=0}^{n} S(n,k)\,x(x-1)(x-2)\cdots(x-k+1). \tag{2.11} \]
Proof. Both sides of the identity are polynomials in $x$, so it suffices to verify the identity
when $x$ is a nonnegative integer. Fix $x \ge 0$ and $n \ge 0$. Let $A$ be the set of placements
of $n$ non-attacking rooks on the extended Ferrers board $\Delta_n(x) = \mathrm{dg}(x+n-1, x+n-
2, \ldots, x+1, x)$. We can build a placement $\pi \in A$ by placing one rook in each row, working
from bottom to top. The rook in the bottom row can go in any of $x$ squares. The rook in the
next row can go in any of $(x+1) - 1 = x$ squares, since one column is attacked by the rook
in the bottom row. In general, the rook located $i \ge 0$ rows above the bottom row can go in
$(x+i) - i = x$ squares, since $i$ distinct columns are already attacked by lower rooks when
the time comes to place the rook in this row. The product rule therefore gives $|A| = x^n$.
On the other hand, $A$ is the disjoint union of the sets $A_k$ consisting of all $\pi \in A$ with
exactly $k$ rooks in the new squares and $n-k$ rooks in the old squares. (Recall that the
new squares are the leftmost $x$ columns of $\Delta_n(x)$.) To build $\pi \in A_k$, first place $n-k$
non-attacking rooks in $\Delta_n$ in any of $S(n,k)$ ways. There are now $k$ unused rows of new squares,
each of length $x$. Visit these rows from top to bottom (say), placing one rook in each row.
There are $x$ choices for the first rook, then $x-1$ choices for the second rook (since the first
rook's column must be avoided), then $x-2$ choices for the third rook, etc. The product
rule gives $|A_k| = S(n,k)\,x(x-1)(x-2)\cdots(x-k+1)$. Hence, by the sum rule,

\[ |A| = \sum_{k=0}^{n} S(n,k)\,x(x-1)(x-2)\cdots(x-k+1). \]

The result follows by comparing our two formulas for $|A|$. □


2.82. Remark. The proof technique of using extended boards such as $\Delta_n(x)$ can be reused
to prove other results in rook theory. See §12.3.
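The two counts in the proof of 2.81 can be reproduced by enumeration on a small extended board. The sketch below is illustrative and rests on a modeling assumption: row $r$ of $\Delta_n(x)$, counted from the bottom starting at $r = 0$, is taken to consist of columns $0, 1, \ldots, x+r-1$, with columns $0, \ldots, x-1$ being the new squares. The function name is hypothetical.

```python
from itertools import product

def extended_rook_counts(n, x):
    """Count placements of n non-attacking rooks on Delta_n(x), grouped by
    k = number of rooks that sit in the x new (leftmost) columns."""
    counts = [0] * (n + 1)
    # One rook per row; cols[r] is the column of the rook in row r.
    for cols in product(*(range(x + r) for r in range(n))):
        if len(set(cols)) == n:                      # all columns distinct
            k = sum(1 for c in cols if c < x)
            counts[k] += 1
    return counts

counts = extended_rook_counts(3, 4)   # n = 3, x = 4
total = sum(counts)
```

With $n = 3$ and $x = 4$, the grouped counts should be $S(3,k) \cdot 4 \cdot 3 \cdots (4-k+1)$ for $k = 0, 1, 2, 3$, and their total should be $4^3$.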


We can also prove polynomial identities in several variables by verifying that the iden- 
tity holds for sufficiently many values of the variables. As an example, we now present a 
combinatorial proof of the binomial theorem. 


2.83. Combinatorial Binomial Theorem. For all integers $n \ge 0$,

\[ (x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k} \quad \text{in } \mathbb{R}[x,y]. \]


Proof. Let $p = (x+y)^n \in \mathbb{R}[x,y]$ and $q = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k} \in \mathbb{R}[x,y]$. We first show that
$p(i,j) = q(i,j)$ for all nonnegative integers $i, j \ge 0$. Fix such integers $i$ and $j$. Consider an
alphabet $A$ consisting of $i$ consonants and $j$ vowels. Let $B$ be the set of all $n$-letter words
using letters from $A$. The product rule gives $|B| = (i+j)^n = p(i,j)$. On the other hand,
$B$ is the disjoint union of sets $B_k$ (for $0 \le k \le n$) where $B_k$ consists of the words in $B$
with exactly $k$ consonants. To build a word $w \in B_k$, first choose the positions for the $k$
consonants out of the $n$ available positions in $\binom{n}{k}$ ways. Choose the $k$ consonants in these
positions from left to right ($i$ ways each), and then choose the $n-k$ vowels in the remaining
positions from left to right ($j$ ways each). The product rule gives $|B_k| = \binom{n}{k} i^k j^{n-k}$. Hence,
the sum rule gives $|B| = q(i,j)$.

We complete the proof by invoking 2.79 twice. First, for each nonnegative integer $i \ge 0$,
the polynomials $p(i,y)$ and $q(i,y)$ in $\mathbb{R}[y]$ agree for infinitely many values of the formal
variable $y$. So, $p(i,y) = q(i,y)$ in $\mathbb{R}[y]$ and also in $\mathbb{R}(y)$ (the field of fractions of the polynomial
ring $\mathbb{R}[y]$, cf. 7.44). Now regard $p$ and $q$ as elements of the polynomial ring $\mathbb{R}(y)[x]$. We have
just shown that these polynomials (viewed as polynomials in the single variable $x$) agree in
$\mathbb{R}(y)$ for infinitely many values of $x$. Hence, $p = q$ in $\mathbb{R}(y)[x]$, and so $p = q$ in $\mathbb{R}[x,y]$. This
argument generalizes to polynomial identities in any number of variables (see 2.158), so we
will omit the details in the future. □
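The word-counting step of this proof can be replayed for small parameters. In the Python sketch below, the alphabet encoding (tagging each letter as a consonant "C" or a vowel "V") and the function name are assumptions made for illustration.

```python
from itertools import product
from math import comb

def words_by_consonant_count(i, j, n):
    """Classify all n-letter words over i consonants and j vowels by the
    number of consonants they contain."""
    alphabet = [("C", t) for t in range(i)] + [("V", t) for t in range(j)]
    counts = [0] * (n + 1)
    for word in product(alphabet, repeat=n):
        k = sum(1 for kind, _ in word if kind == "C")
        counts[k] += 1
    return counts

i, j, n = 2, 3, 4
counts = words_by_consonant_count(i, j, n)
expected = [comb(n, k) * i**k * j**(n - k) for k in range(n + 1)]
```

Here `counts` corresponds to the sizes $|B_k|$, `expected` to the terms of $q(i,j)$, and the total number of words to $p(i,j) = (i+j)^n$.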


Summary 


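• Generalized Distributive Law. To multiply out a product of factors, where each factor is
a sum of terms, choose one term from each factor, multiply these choices together, and
add up the resulting products. Formally, this can be written:

\[ \prod_{k=1}^{n} \Bigl( \sum_{i_k \in I_k} x_{k,i_k} \Bigr)
   = \sum_{(i_1,\ldots,i_n) \in I_1 \times \cdots \times I_n} \Bigl( \prod_{k=1}^{n} x_{k,i_k} \Bigr)
   \qquad \text{(all $x_{k,i_k}$ lie in a ring $R$).} \]

If $A_1,\ldots,A_n$ are finite subsets of $R$, we can also write

\[ \Bigl( \sum_{w_1 \in A_1} w_1 \Bigr) \Bigl( \sum_{w_2 \in A_2} w_2 \Bigr) \cdots \Bigl( \sum_{w_n \in A_n} w_n \Bigr)
   = \sum_{(w_1,\ldots,w_n) \in A_1 \times A_2 \times \cdots \times A_n} w_1 w_2 \cdots w_n. \]

• Multinomial and Binomial Theorems. In any ring $R$ (possibly non-commutative),

\[ (z_1 + z_2 + \cdots + z_s)^n = \sum_{\text{words } w \in \{1,\ldots,s\}^n} z_{w_1} z_{w_2} \cdots z_{w_n}
   \qquad (s, n \in \mathbb{N}^+,\ z_i \in R). \]

If $z_iz_j = z_jz_i$ for all $i, j$, this becomes

\[ (z_1 + z_2 + \cdots + z_s)^n = \sum_{n_1 + n_2 + \cdots + n_s = n} \binom{n}{n_1, n_2, \ldots, n_s}
   z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s}. \]

If $xy = yx$, we have the binomial theorem

\[ (x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}. \]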

• Combinatorial Proofs. To prove a formula of the form $a = b$ combinatorially, one can
define a set $S$ of objects and give two counting arguments showing $|S| = a$ and $|S| = b$,
respectively. To prove a polynomial identity $p(x) = q(x)$, it suffices to verify the identity
for infinitely many values of the variable $x$ (say for all nonnegative integers $x$). Similar
comments apply to multivariable polynomial identities.


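• Identities for Binomial Coefficients.

$$\sum_{k=0}^{n} \binom{n}{k} = 2^n; \qquad \sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}; \qquad \binom{a+b}{a} = \sum_{k=0}^{a} \binom{k+b-1}{k};$$

$$\binom{a+b+c+1}{a} = \sum_{k=0}^{a} \binom{k+b}{k} \binom{a-k+c}{a-k} \qquad \text{(Chu-Vandermonde formula)};$$

$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k} \qquad \text{(Pascal recursion)}.$$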


• Combinatorial Definitions. An integer partition of $n$ into $k$ parts is a weakly decreasing
list $\mu = (\mu_1, \mu_2, \ldots, \mu_k)$ of positive integers that sum to $n$. A set partition of a set $S$ into
$k$ blocks is a set $P$ of $k$ nonempty, pairwise disjoint sets with union $S$. An equivalence
relation on $S$ is a reflexive, symmetric, transitive relation on $S$. The map that sends an
equivalence relation to the set of its equivalence classes defines a bijection from the set
of equivalence relations on $S$ to the set of set partitions of $S$.


• Notation for Combinatorial Objects. Table 2.1 indicates notation used for counting some
collections of combinatorial objects.


• Recursions. A collection of combinatorial objects can often be described recursively, by
using smaller objects of the same kind to build larger objects. Induction arguments can
be used to prove facts about recursively defined objects. Table 2.2 lists some recursions
satisfied by the quantities in Table 2.1. In each case, the recursion together with appropriate
initial conditions uniquely determines the quantities under consideration. If two
collections of objects satisfy the same recursion and initial conditions, one can link together
two combinatorial proofs of the recursion to obtain recursively defined bijections
between the two collections.

TABLE 2.1
Notation for counting combinatorial objects.

Here $f_n$ is a Fibonacci number; $B(n)$ is a Bell number; $S(n,k)$ is a Stirling number of the
second kind; $s'(n,k)$ is a signless Stirling number of the first kind; $T(a,b)$ is a ballot number;
$T_m(a,b)$ is an $m$-ballot number; and $C_n$ is a Catalan number.

Notation | What it counts
$C(n,k) = \binom{n}{k}$ | $k$-element subsets of an $n$-element set
$C(n; n_1,\ldots,n_s) = \binom{n}{n_1,\ldots,n_s}$ | anagrams of the word $a_1^{n_1} \cdots a_s^{n_s}$
$L(a,b) = \binom{a+b}{a,b}$ | lattice paths from $(0,0)$ to $(a,b)$
$f_n$ | words in $\{0,1\}^n$ not containing 00
$M(n,k) = \binom{k+n-1}{k,n-1}$ | $k$-element multisets of an $n$-element set
$B(n)$ | set partitions of an $n$-element set; equivalence relations on an $n$-element set
$S(n,k)$ | set partitions of $\{1,\ldots,n\}$ into $k$ blocks; equiv. relations on $\{1,\ldots,n\}$ with $k$ equiv. classes; placements of $n-k$ non-attacking rooks on $\Delta_n$
$\mathrm{Surj}(n,k) = k!\,S(n,k)$ | surjections from $\{1,\ldots,n\}$ to $\{1,\ldots,k\}$
$s'(n,k)$ | placements of $n-k$ non-attacking wrooks on $\Delta_n$; permutations of $n$ objects with $k$ cycles (§3.6)
$s(n,k) = (-1)^{n+k} s'(n,k)$ | Stirling numbers of first kind (with signs)
$p(n)$ | integer partitions of $n$
$p(n,k)$ | integer partitions of $n$ into $k$ parts
$T(a,b) = \frac{b-a+1}{b+a+1}\binom{a+b+1}{a}$ | lattice paths from $(0,0)$ to $(a,b)$ that do not go below $y = x$ ($b \ge a$)
$T_m(a,b) = \frac{b-ma+1}{b+a+1}\binom{a+b+1}{a}$ | lattice paths from $(0,0)$ to $(a,b)$ that do not go below $y = mx$ ($m \in \mathbb{N}^+$, $b \ge ma$)
$C_n = T(n,n) = \frac{1}{n+1}\binom{2n}{n}$ | Dyck paths ending at $(n,n)$; binary trees with $n$ vertices; 231-avoiding permutations; etc.

TABLE 2.2
Some combinatorial recursions.

$C(n,k) = C(n-1,k-1) + C(n-1,k)$
$C(n; n_1,\ldots,n_s) = \sum_{k=1}^{s} C(n-1; n_1,\ldots,n_k - 1,\ldots,n_s)$
$f_n = f_{n-1} + f_{n-2}$
$M(n,k) = M(n-1,k) + M(n,k-1)$
$L(a,b) = L(a-1,b) + L(a,b-1)$
$T(a,b) = T(a-1,b) + T(a,b-1)\,\chi(b-1 \ge a)$
$C_n = \sum_{k=1}^{n} C_{k-1} C_{n-k}$
$p(n,k) = p(n-1,k-1) + p(n-k,k)$
$p(n) = \sum_{m \ge 1} (-1)^{m-1} \bigl[ p(n - m(3m-1)/2) + p(n - m(3m+1)/2) \bigr]$
$S(n,k) = S(n-1,k-1) + k\,S(n-1,k)$
$B(n) = \sum_{k=0}^{n-1} \binom{n-1}{k} B(n-1-k)$
$\mathrm{Surj}(n,k) = k\,\mathrm{Surj}(n-1,k-1) + k\,\mathrm{Surj}(n-1,k)$
$s'(n,k) = s'(n-1,k-1) + (n-1)\,s'(n-1,k)$

• Polynomial Identities for Stirling numbers. Define rising and falling factorials by
$(x){\uparrow}_n = x(x+1)(x+2)\cdots(x+n-1)$ and $(x){\downarrow}_n = x(x-1)(x-2)\cdots(x-n+1)$. The sets
$\{x^n : n \ge 0\}$, $\{(x){\uparrow}_n : n \ge 0\}$, and $\{(x){\downarrow}_n : n \ge 0\}$ are all bases of the vector space of
real polynomials in one variable. The Stirling numbers are the entries in the transition
matrices between these bases. More specifically,

$$x^n = \sum_k S(n,k)\,(x){\downarrow}_k; \qquad (x){\uparrow}_n = \sum_k s'(n,k)\,x^k; \qquad (x){\downarrow}_n = \sum_k s(n,k)\,x^k.$$

So the matrices $S = (S(n,k))_{n,k}$ and $s = (s(n,k))_{n,k}$ are inverses of each other, i.e.,

$$\sum_k S(i,k)\,s(k,j) = \chi(i = j) = \sum_k s(i,k)\,S(k,j) \qquad (i, j \ge 0).$$
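The inverse relation between the two Stirling matrices can be checked on a finite truncation. The following is a minimal sketch that assumes the recursions for $S(n,k)$ and $s'(n,k)$ from Table 2.2 with initial conditions $S(0,0) = s'(0,0) = 1$; the function names and the cutoff `N = 6` are illustrative choices.

```python
def stirling_tables(N):
    """Return (S, c): S[n][k] = S(n,k) and c[n][k] = s'(n,k) for 0 <= n, k <= N."""
    S = [[0] * (N + 1) for _ in range(N + 1)]
    c = [[0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = c[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
            c[n][k] = c[n - 1][k - 1] + (n - 1) * c[n - 1][k]
    return S, c

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

N = 6
S, c = stirling_tables(N)
# Signed Stirling numbers of the first kind: s(n,k) = (-1)^(n+k) s'(n,k).
s = [[(-1) ** (n + k) * c[n][k] for k in range(N + 1)] for n in range(N + 1)]
identity = [[int(i == j) for j in range(N + 1)] for i in range(N + 1)]

if __name__ == "__main__":
    print(mat_mul(S, s) == identity, mat_mul(s, S) == identity)
```

Both matrices are lower triangular, so truncating them to indices $0,\ldots,N$ does not affect the entries of the product in that range.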


Exercises 


2.84. Simplify the product (B+C+H)(A+E+U)(R+T), where each letter denotes an 
arbitrary n x n real matrix. 


2.85. Expand (A+ B+ C)?, where A, B, C are n x n matrices.
2.86. Find the coefficient of w?x? yz? in (w+x+y+z)%, assuming w, x, y, and z commute.
2.87. Expand (32 — 2)° into a sum of monomials. 

2.88. Find the constant term in (22 — «~')®. 

2.89. Find the coefficient of x? in (x? + a+ 1). 


2.90. Prove algebraically that $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = \chi(n = 0)$ and $\sum_{k=0}^{n} 2^k \binom{n}{k} = 3^n$ for all
$n \ge 0$. Can you find combinatorial proofs?


2.91. Given n € N*, evaluate loc; cren (4) (4): 
2.92. Given m,n € N*, evaluate 04, 4454--4h,,<n(Kilka! ++: km!)7". 


2.93. Use Pascal’s recursion to compute $\binom{9}{k}$ for $0 \le k \le 9$ and $\binom{10}{k}$ for $0 \le k \le 10$.


2.94. Give a proof of the recursion 2.27 for multinomial coefficients based on multidimen- 
sional lattice paths. 


2.95. Compute the ballot numbers Ta, 7) for 0 < a < 7 by drawing a picture. 


2.96. For fixed $k \in \mathbb{N}^+$, let $a_n$ be the number of $n$-letter words using the alphabet
$\{0,1,\ldots,k\}$ that do not contain 00. Find a recursion and initial conditions for $a_n$.


2.97. How many words in {0,1,2}° do not contain 000? 


2.98. How many lattice paths from (1,1) to (6,6) always stay weakly between the lines
$y = 2x/5$ and $y = 5x/2$?


2.99. How many lattice paths go from (1,1) to (8,8) without ever passing through a point
$(p,q)$ such that $p$ and $q$ are both prime?


2.100. Show that $C_n$ counts integer partitions $\mu$ such that $\mathrm{dg}(\mu) \subseteq \Delta_n$.


2.101. Draw pictures of all integer partitions of n = 6 and n = 7. Indicate which partitions 
are conjugates of one another. 


2.102. Compute p(8,3) by direct enumeration and by using a recursion. 
2.103. Use Euler’s recursion to compute $p(k)$ for $k = 13, 14, 15, 16$ (see Figure 2.19).


2.104. (a) Write down all the set partitions and rook placements counted by $S(5,2)$. (b)
List all the set partitions and equivalence relations counted by $B(4)$. (c) Draw all the wrook
placements counted by $s'(4,2)$.


2.105. Compute $S(9,k)$ for $0 \le k \le 9$ and $S(10,k)$ for $0 \le k \le 10$ (use Figure 2.21).
2.106. Compute the Bell number $B(k)$ for $k = 9, 10, 11, 12$ (use Figure 2.21).
2.107. Compute $s(8,k)$ for $0 \le k \le 8$ (use Figure 2.24).


2.108. (a) Find a combinatorial proof of the formula $\sum_{i=1}^{n} i = n(n+1)/2$. (b) Can you
prove $\sum_{i=1}^{n} i^2 = n(n+1)(2n+1)/6$ combinatorially?


2.109. Prove the identity $k\binom{n}{k} = n\binom{n-1}{k-1} = (n-k+1)\binom{n}{k-1}$ algebraically and
combinatorially (where $1 \le k \le n$).
2.110. Suppose X is an n-element set. Count the number of: (a) relations on X; (b) reflexive
relations on X; (c) irreflexive relations on X; (d) symmetric relations on X; (e) irreflexive
and symmetric relations on X; (f) antisymmetric relations on X.


2.111. Let X be a nine-element set and Y a four-element set. (a) How many functions map 
X into Y? (b) How many functions map Y into X? (c) How many surjections are there 
from X onto Y? (d) How many injections are there from Y into X? 


2.112. Verify equations (2.6), (2.7), and (2.8) by direct calculation for n = 3 and n = 4. 
2.113. (a) Find the rook placement associated to the set partition 
{{2, 5}, {1,4, 7, 10}, {3}, {6, 8}, {9}} 


by the bijection in 2.64. (b) Find the set partition associated to the following rook placement: 


2.114. Let f : {1,2,...,7} — {1,2,3} be the surjection given by f(1) = 3, f(2) = 3, 
f(3) =1, f(4) = 3, f(5) = 2, f(6) = 3, f(7) = 1. In the second proof of 2.58, what choice 
sequence can be used to construct f? 


2.115. How many compositions of 20 only use parts of sizes 1, 3, or 5? 


2.116. Use the recursion 2.26 for multisets to prove by induction that the number of
$k$-element multisets using an $n$-element alphabet is $M(n,k) = \binom{k+n-1}{k,n-1}$.


2.117. Given a,b,c,n € N* with a+6+c¢ = n, prove combinatorially that Fe) = 
—k-1 —k-1 
ee, (eeteee ai Raa 5 


2.118. Complete the proof of 2.31 by proving $T_m(a,b) = \frac{b-ma+1}{b+a+1}\binom{a+b+1}{a}$ by induction.


2.119. Show that $|S_n^{\tau}| = C_n$ for (a) $\tau = 132$; (b) $\tau = 213$; (c) $\tau = 312$. (d) Convert the
binary tree in Figure 2.12 to a $\tau$-avoiding permutation for each of these choices of $\tau$.


2.120. (a) Let $G_n$ be the set of lists of integers $(g_0, g_1, \ldots, g_{n-1})$ where $g_0 = 0$, each $g_i \ge 0$,
and $g_{i+1} \le g_i + 1$ for all $i < n-1$. Prove that $|G_n| = C_n$. (b) For $m \in \mathbb{N}^+$, let $G_n^{(m)}$ be
the set of lists of integers $(g_0, g_1, \ldots, g_{n-1})$ where $g_0 = 0$, each $g_i \ge 0$, and $g_{i+1} \le g_i + m$
for all $i < n-1$. Prove that $|G_n^{(m)}| = T_m(n, mn)$, the number of lattice paths from $(0,0)$ to
$(n, mn)$ that never go below the line $y = mx$.


2.121. Consider the 231-avoiding permutation $w = 1\ 5\ 2\ 4\ 3\ 11\ 7\ 6\ 10\ 8\ 9$. Use recursive
bijections based on the Catalan recursion to map $w$ to objects of the following kinds: (a) a
Dyck path; (b) a binary tree; (c) a 312-avoiding permutation (see 2.119); (d) an element of
$G_{11}$ (see 2.120).


2.122. Let $\pi$ be the Dyck path NNENEENNNENNENNEEENENEEE. Use recursive
bijections based on the Catalan recursion to map $\pi$ to objects of the following kinds: (a) a
binary tree; (b) a 231-avoiding permutation; (c) a 213-avoiding permutation (see 2.119).
2.123. Show that the number of possible rhyme schemes for an n-line poem using k different
rhyme syllables is the Stirling number S(n,k). (For example, ABABCDCDEFEFGG is a
rhyme scheme with n = 14 and k = 7.)


2.124. (a) Find explicit formulas for $S(n,k)$ when $k = 1$, $2$, $n-1$, and $n$. Prove your
formulas using counting arguments. (b) Repeat part (a) for $\mathrm{Surj}(n,k)$.


2.125. Give a combinatorial proof of the identity $k\,S(n,k) = \sum_{j=1}^{n} \binom{n}{j} S(n-j, k-1)$, where
$1 \le k \le n$.


2.126. Prove Cn = ren: o<p<n/2 2 (kin — k)? for n > 1. 


2.127. Consider lattice paths that can take unit steps up (N), down (S), left (W), or right 
(E), with self-intersections allowed. How many such paths begin and end at (0,0) and have 
10 steps? 


2.128. Use 2.52, 2.63, and the ideas in 2.39 to give a recursive definition of a bijection 
between rook placements counted by S’(n,k) and set partitions counted by S(n,k). Is this 
bijection the same as the bijection described in 2.64? 


2.129. Fix $n \in \mathbb{N}^+$, let $\mu$ be an integer partition of length $\ell(\mu) \le n$, and set $\mu_k = 0$ for
$\ell(\mu) < k \le n$. Let $s'(\mu,k)$ be the number of placements of $n-k$ non-attacking wrooks on
the board $\mathrm{dg}(\mu)$. (a) Find a summation formula for $s'(\mu,k)$ analogous to 2.75. (b) Prove
that

$$(x + \mu_1)(x + \mu_2) \cdots (x + \mu_n) = \sum_k s'(\mu,k)\,x^k.$$

(c) For $n = 7$ and $\mu = (8,5,3,3,1)$, find $s'(\mu,k)$ for $0 \le k \le 7$.


2.130. (a) Show that the Fibonacci number $f_{n-1}$ (see 2.23) is the number of compositions
of $n$ in which every part has size 1 or 2. (b) Show that $f_n$ is the number of subsets of
$\{1,2,\ldots,n\}$ that do not contain two consecutive integers. (c) Combine (b) with 1.113 to
deduce a summation formula for $f_n$.


2.131. (a) Show that the sequence $a_n = f_{2n}$ (see 2.23) satisfies the recursion $a_n = 3a_{n-1} - a_{n-2}$
for $n \ge 2$. What are the initial conditions? (b) Show that $a_n$ is the number of words
in $\{A, B, C\}^n$ in which A is never immediately followed by B.


2.132. For $n \ge 0$, let $a_n$ be the number of words in $\{1,2,\ldots,k\}^n$ in which 1 is never
immediately followed by 2. Find a recursion satisfied by the sequence $(a_n : n \ge 0)$, and
prove it with a suitable bijection.


2.133. (a) Give algebraic and combinatorial proofs of the identity

$$x^n - 1 = (x-1)1 + (x-1)x + (x-1)x^2 + \cdots + (x-1)x^{n-1} \qquad (x \in \mathbb{R}).$$

(b) Deduce a formula for $\sum_{n=0}^{\infty} x^n$, valid for real numbers $x$ with $|x| < 1$.


2.134. Define $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all $n > 1$ (so $f_n = F_{n+2}$ for $n \ge 0$).
Give algebraic or combinatorial proofs of the following formulas. (a) $F_n = (\phi^n - \psi^n)/\sqrt{5}$,
where $\phi = (1+\sqrt{5})/2$, $\psi = (1-\sqrt{5})/2$. (b) $\sum_{k=0}^{n} F_k = F_{n+2} - 1$. (c) $\sum_{k=0}^{n-1} F_{2k+1} = F_{2n}$.
(CQ) Spon = Pog 


2.135. Give a combinatorial proof of equation (2.11) by interpreting both sides as counting
a suitable collection of functions.
2.136. Let $C_{n,k}$ be the number of Dyck paths of order $n$ that end with exactly $k$ east steps.


Prove the recursion 
n—k 
k-l14+r 
Cn,k ae S- ( Tose lr } ont 


r=1 


2.137. Let $p$ be prime. Prove that $\binom{p}{k}$ is divisible by $p$ for $0 < k < p$. Can you find a
combinatorial proof?


2.138. Fermat’s Little Theorem states that $a^p \equiv a \pmod{p}$ for $a \in \mathbb{N}^+$ and $p$ prime. Prove
this by expanding $a^p = (1 + 1 + \cdots + 1)^p$ using the multinomial theorem (cf. 2.137).


2.139. Ordered Set Partitions. An ordered set partition of a set $X$ is a sequence $P =
(T_1, T_2, \ldots, T_k)$ of distinct sets such that $\{T_1, T_2, \ldots, T_k\}$ is a set partition of $X$. Let $B_o(n)$ be
the number of ordered set partitions of an $n$-element set. (a) Show $B_o(n) = \sum_{k=1}^{n} k!\,S(n,k)$
for $n \ge 1$. (b) Find a recursion relating $B_o(n)$ to values of $B_o(m)$ for $m < n$. (c) Compute
$B_o(n)$ for $0 \le n \le 5$.


2.140. (a) Let $B_1(n)$ be the number of set partitions of an $n$-element set such that no block
of the partition has size 1. Find a recursion and initial conditions for $B_1(n)$, and use these
to compute $B_1(n)$ for $1 \le n \le 6$. (b) Let $S_1(n,k)$ be the number of set partitions as in (a)
with $k$ blocks. Find a recursion and initial conditions for $S_1(n,k)$.


2.141. Let $p_d(n,k)$ be the set of integer partitions of $n$ with first part $k$ and all parts
distinct. Find a recursion and initial conditions for $p_d(n,k)$.


2.142. Let p(n, k) be the set of integer partitions of n with first part k& and all parts odd. 
Find a recursion and initial conditions for p(n, k). 


2.143. Let $q(n,k)$ be the number of integer partitions $\mu$ of length $k$ and area $n$ such that
$\mu' = \mu$ (such partitions are called self-conjugate). Find a recursion and initial conditions for
$q(n,k)$.


2.144. Verify the statement made in 2.76 that any indexed collection of polynomials
$\{p_n(x) : n \ge 0\}$ such that $\deg(p_n) = n$ for all $n$ is a basis for the real vector space of
polynomials in one variable with real coefficients.


2.145. Verify the following statements about transition matrices from §2.13. (a) If $B$ and
$C$ are ordered bases of $W$ and $A$ is the transition matrix from $B$ to $C$, then $[v]_C = A[v]_B$
for all $v \in W$. (b) If $A$ is the transition matrix from $B$ to $C$, then $A$ is invertible, and $A^{-1}$
is the transition matrix from $C$ to $B$.


2.146. Complete the proof of 2.55 by verifying that: (a) ¢(P) € B for all P € A; (b) 
¢'(R) € A for all RE B; (c) Go = idg; (d) # og = idy. 


2.147. Consider a product $x_1 \times x_2 \times \cdots \times x_n$ where the binary operation $\times$ is not necessarily
associative. Show that the number of ways to parenthesize this expression is the Catalan
number $C_{n-1}$. For example, the five possible parenthesizations when $n = 4$ are

$(((x_1 \times x_2) \times x_3) \times x_4)$, $((x_1 \times x_2) \times (x_3 \times x_4))$, $(x_1 \times ((x_2 \times x_3) \times x_4))$,

$(x_1 \times (x_2 \times (x_3 \times x_4)))$, $((x_1 \times (x_2 \times x_3)) \times x_4)$.


2.148. Generalized Associative Law. Let $\times$ be an associative binary operation on a
set $S$ (so $(x \times y) \times z = x \times (y \times z)$ for all $x, y, z \in S$). Given $x_1, \ldots, x_n \in S$, recursively
define $\prod_{j=1}^{1} x_j = x_1$ and $\prod_{j=1}^{n} x_j = \bigl(\prod_{j=1}^{n-1} x_j\bigr) \times x_n$. Prove that every parenthesization of
the expression $x_1 \times x_2 \times \cdots \times x_n$ evaluates to $\prod_{j=1}^{n} x_j$ (use strong induction). This result
justifies the omission of parentheses in expressions of this kind.
2.149. Generalized Commutative Law. Let $+$ be an associative and commutative binary
operation on a set $S$. Prove that for any bijection $f : \{1,2,\ldots,n\} \to \{1,2,\ldots,n\}$ and
any elements $x_1, \ldots, x_n \in S$,

$$x_1 + x_2 + \cdots + x_n = x_{f(1)} + x_{f(2)} + \cdots + x_{f(n)}.$$


2.150. For each positive integer $n$, let $\mathbb{Z}_n = \{0,1,2,\ldots,n-1\}$. Define binary operations
$\oplus$ and $\otimes$ on $\mathbb{Z}_n$ by letting $a \oplus b = (a+b) \bmod n$ and $a \otimes b = (a \cdot b) \bmod n$, where $c \bmod n$
denotes the unique remainder in $\mathbb{Z}_n$ when $c$ is divided by $n$. (a) Prove that $\mathbb{Z}_n$ with these
operations is a commutative ring. (b) Prove that $\mathbb{Z}_n$ is a field iff $n$ is a prime number.


2.151. Let $R$ be a ring, and let $M_n(R)$ be the set of all $n \times n$ matrices with entries in $R$.
Define matrix addition and multiplication as follows. Writing $A(i,j)$ for the $i,j$-entry of a
matrix $A$, let $(A+B)(i,j) = A(i,j) + B(i,j)$ and $(AB)(i,j) = \sum_{k=1}^{n} A(i,k)B(k,j)$. Verify
the ring axioms in 2.2 for $M_n(R)$. Show that this ring is non-commutative whenever $n > 1$
and $|R| > 1$. If $R$ is finite, what is $|M_n(R)|$?


2.152. Let $R$ be a ring, and let $R[x]$ be the set of all one-variable “formal” polynomials with
coefficients in $R$. Define polynomial addition and multiplication as follows. If $p = \sum_{i \ge 0} a_i x^i$
and $q = \sum_{j \ge 0} b_j x^j$ with $a_i, b_j \in R$ (and $a_i = 0$, $b_j = 0$ for large enough $i, j$), set $p + q =
\sum_{i \ge 0} (a_i + b_i) x^i$ and $pq = \sum_k c_k x^k$, where $c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0$. Also, by definition,
$p = q$ means $a_i = b_i$ for all $i \ge 0$. Verify the ring axioms in 2.2 for $R[x]$.


2.153. (a) Let $R$ be the set of all functions $f : \mathbb{Z} \to \mathbb{Z}$. Given $f, g \in R$, define $f \oplus g : \mathbb{Z} \to \mathbb{Z}$
by $(f \oplus g)(n) = f(n) + g(n)$ for all $n \in \mathbb{Z}$, and define $f \circ g$ by $(f \circ g)(n) = f(g(n))$ (composition
of functions). Show that $(R, \oplus, \circ)$ satisfies all the ring axioms in 2.2 except commutativity
of multiplication and the left distributive law. (b) Let $S$ be the set of $f \in R$ such that
$f(m+n) = f(m) + f(n)$ for all $m, n \in \mathbb{Z}$. Prove that $(S, \oplus, \circ)$ is a ring.


2.154. Prove the binomial theorem 2.14 by induction on $n$. Mark each place in your proof
where you use the hypothesis $xy = yx$.


2.155. Prove the commutative multinomial theorem 2.12 using the binomial theorem and 
induction on s. 


2.156. Let $f, g$ be smooth functions of $x$ (which means $f$ and $g$ have derivatives of all
orders). Recall the product rule: $D(fg) = D(f)g + fD(g)$, where $D$ denotes differentiation
with respect to $x$. (a) Prove that the $n$th derivative of $fg$ is given by

$$D^n(fg) = \sum_{k=0}^{n} \binom{n}{k} D^k(f)\,D^{n-k}(g).$$

(b) Find and prove a similar formula for $D^n(f_1 f_2 \cdots f_s)$, where $f_1, \ldots, f_s$ are smooth
functions.


2.157. (a) Given a field $F$, an element $c \in F$, and a polynomial $p \in F[x]$, show that $p(c) = 0$
iff $x - c$ divides $p$ in $F[x]$. (b) Show that if $p \in F[x]$ is a nonzero polynomial with more than
$N$ roots in $F$, then $\deg(p) > N$. (c) Show that (b) can fail if $F$ is a commutative ring that
is not a field.


2.158. Let $p, q \in \mathbb{R}[x_1, \ldots, x_n]$ be multivariable polynomials such that $p(m_1, \ldots, m_n) =
q(m_1, \ldots, m_n)$ for all $m_1, \ldots, m_n \in \mathbb{N}^+$. Prove that $p = q$.


2.159. (a) Give a combinatorial proof of the multinomial theorem 2.12, assuming that
$z_1, z_2, \ldots, z_s$ are positive integers. (b) Deduce that this theorem is also valid for all
$z_1, \ldots, z_s \in \mathbb{R}$.
2.160. Let $A_n$ be the set of lattice paths from $(0,0)$ to $(n,n)$ that take exactly one north
step below the line $y = x$. What is $|A_n|$?


2.161. Prove: for $n \in \mathbb{N}$, $\sum_{k=0}^{n} \binom{2k}{k} \binom{2n-2k}{n-k} = 4^n$.


Notes 


The book by Gould [58] contains an extensive, systematic list of binomial coefficient
identities. More recently, Petkovšek, Wilf and Zeilberger [104] developed an algorithm, called
the WZ-method, that can automatically evaluate many hypergeometric summations (which
include binomial coefficient identities) or prove that such a summation has no closed form.
For more information on hypergeometric series, see Koepf [80].

A wealth of information about integer partitions, including a discussion of the Hardy- 
Rademacher-Ramanujan formula 2.49, may be found in [5]. There is a vast literature on 
pattern-avoiding permutations; for more information on this topic, consult Bona [15]. 

A great many combinatorial interpretations have been discovered for the Catalan numbers
$C_n$. A partial list appears in Exercise 6.19 of Stanley [127, Vol. 2]; this list continues in
the “Catalan addendum,” which currently resides on the Internet at the following location:


http://www-math.mit.edu/~rstan/ec/catadd.pdf

Counting Problems in Graph Theory


Graph theory is a branch of discrete mathematics that studies networks composed of a 
number of sites (vertices) linked together by connecting arcs (edges). This chapter studies 
some enumeration problems that arise in graph theory. We begin by defining fundamental 
graph-theoretic concepts such as walks, paths, cycles, vertex degrees, connectivity, forests, 
and trees. This will lead to a discussion of various enumeration problems involving different 
kinds of trees. Aided by ideas from matrix theory, we will count walks in a graph, spanning 
trees of a graph, and Eulerian tours. We also investigate the chromatic polynomial of a 
graph, which counts the number of ways of coloring the vertices such that no two vertices 
joined by an edge receive the same color. 




3.1 Graphs and Digraphs 


Intuitively, a graph is a mathematical model for a network consisting of a collection of nodes 
and connections that link certain pairs of nodes. For example, the nodes could be cities 
and the connections could be roads between cities. For another example, the nodes could 
be computers and the connections could be network links between computers. For a third 
example, the nodes could be species in an ecosystem and the connections could be predator- 
prey relationships between species. Or the nodes could be tasks and the connections could 
be dependencies among the tasks. There is an unlimited variety of such applications that 
lead naturally to graph models. We now give the formal mathematical definitions underlying 
such models. 


3.1. Definition: Graphs. A graph is an ordered triple $G = (V, E, \epsilon)$, where: $V = V(G)$ is
a finite, nonempty set called the vertex set of $G$; $E = E(G)$ is a finite set called the edge set
of $G$; and $\epsilon : E \to P(V)$ is a function called the endpoint function such that, for all $e \in E$,
$\epsilon(e)$ is either a one-element subset of $V$ or a two-element subset of $V$. If $\epsilon(e) = \{v\}$, we call
the edge $e$ a loop at vertex $v$. If $\epsilon(e) = \{v, w\}$, we call $v$ and $w$ the endpoints of $e$ and say
that $e$ is an edge from $v$ to $w$. We also say that $v$ and $w$ are adjacent in $G$, $v$ and $w$ are
joined by $e$, and $e$ is incident to $v$ and $w$.


We visualize a graph $G = (V, E, \epsilon)$ by drawing a collection of dots labeled by the elements
$v \in V$. For each edge $e \in E$ with $\epsilon(e) = \{v, w\}$, we draw a line or curved arc labeled $e$
between the two dots labeled $v$ and $w$. Similarly, if $\epsilon(e) = \{v\}$, we draw a loop labeled $e$
based at the dot labeled $v$.


3.2. Example. The left-hand drawing in Figure 3.1 represents the graph defined formally 
by the ordered triple 


$$G_1 = (\{1,2,3,4,5\},\ \{a,b,c,d,e,f,g,h,i\},\ \epsilon),$$

where $\epsilon$ acts as follows:

$a \mapsto \{1,4\}$, $b \mapsto \{4,3\}$, $c \mapsto \{2,3\}$, $d \mapsto \{1,2\}$, $e \mapsto \{1,2\}$,
$f \mapsto \{3\}$, $g \mapsto \{2,5\}$, $h \mapsto \{4,5\}$, $i \mapsto \{4,5\}$.

Edge $f$ is a loop at vertex 3; edges $h$ and $i$ both go between vertices 4 and 5; vertices 1 and
4 are adjacent, but vertices 2 and 4 are not.

FIGURE 3.1
A graph, a simple graph, and a digraph.
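To make the triple $(V, E, \epsilon)$ concrete, here is a minimal sketch that stores the graph $G_1$ of Example 3.2 as a vertex set plus a dictionary playing the role of the endpoint function. The names `V`, `eps`, `is_graph`, `loops`, and `adjacent` are illustrative choices, not notation from the text.

```python
# Vertex set and endpoint function of G_1 (Example 3.2).
V = {1, 2, 3, 4, 5}
eps = {
    "a": frozenset({1, 4}), "b": frozenset({4, 3}), "c": frozenset({2, 3}),
    "d": frozenset({1, 2}), "e": frozenset({1, 2}), "f": frozenset({3}),
    "g": frozenset({2, 5}), "h": frozenset({4, 5}), "i": frozenset({4, 5}),
}

def is_graph(V, eps):
    """Check Definition 3.1: V is nonempty and each eps(e) is a 1- or 2-element subset of V."""
    return len(V) > 0 and all(ends <= V and len(ends) in (1, 2)
                              for ends in eps.values())

def loops(eps):
    """Edges whose endpoint set has a single element."""
    return sorted(e for e, ends in eps.items() if len(ends) == 1)

def adjacent(eps, v, w):
    """True iff some edge has endpoint set {v, w}."""
    return any(ends == frozenset({v, w}) for ends in eps.values())

if __name__ == "__main__":
    print(is_graph(V, eps), loops(eps), adjacent(eps, 1, 4), adjacent(eps, 2, 4))
```

Keeping edge names as dictionary keys is what allows parallel edges such as $h$ and $i$ to coexist; a plain set of endpoint pairs would merge them.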


In many applications, there are no loop edges, and there is never more than one edge
between the same two vertices. This means that the endpoint function $\epsilon$ is a one-to-one
map into the set of two-element subsets of $V$. So we can identify each edge $e$ with its set of
endpoints $\epsilon(e)$. This leads to the following simplified model in which edges are not explicitly
named and there is no explicit endpoint function.


3.3. Definition: Simple Graphs. A simple graph is a pair G = (V,E), where V is a 
finite nonempty set and E is a set of two-element subsets of V. We continue to use all the 
terminology introduced in 3.1. 


3.4. Example. The central drawing in Figure 3.1 depicts the simple graph $G_2$ with vertex
set $V(G_2) = \{0,1,2,3,4,5,6\}$ and edge set

$$E(G_2) = \{\{1,2\}, \{2,3\}, \{3,4\}, \{4,5\}, \{5,6\}, \{1,6\}, \{2,6\}, \{1,4\}\}.$$


To model certain situations (such as predator-prey relationships, or one-way streets in 
a city), we need to introduce a direction on each edge. This leads to the notion of a digraph 
(directed graph). 


3.5. Definition: Digraphs. A digraph is an ordered triple $D = (V, E, \epsilon)$, where $V$ is a
finite nonempty set of vertices, $E$ is a finite set of edges, and $\epsilon : E \to V \times V$ is the endpoint
function. If $\epsilon(e)$ is the ordered pair $(v,w)$, we say that $e$ is an edge from $v$ to $w$.


In a digraph, an edge from $v$ to $w$ is not an edge from $w$ to $v$ when $v \ne w$, since
$(v,w) \ne (w,v)$. On the other hand, in a graph, an edge from $v$ to $w$ is also an edge from $w$
to $v$, since $\{v,w\} = \{w,v\}$.


3.6. Example. The right-hand drawing in Figure 3.1 displays a typical digraph. In this
digraph, $\epsilon(j) = (4,5)$, $\epsilon(a) = (1,1)$, and so on. There are three edges from 2 to 3, but no
edges from 3 to 2. There are edges in both directions between vertices 1 and 5.

As before, we can eliminate specific reference to the endpoint function of a digraph if
there are no multiple edges with the same starting vertex and ending vertex.


3.7. Definition: Simple Digraphs. A simple digraph is an ordered pair $D = (V, E)$,
where $V$ is a finite, nonempty set and $E$ is a subset of $V \times V$. Each ordered pair $(v,w) \in E$
represents an edge in $D$ from $v$ to $w$. Note that we do allow loops ($v = w$) in a simple
digraph.


When investigating structural properties of graphs, the names of the vertices and edges 
are often irrelevant. The concept of graph isomorphism lets us identify graphs that are “the 
same” except for the names used for the vertices and edges. 


3.8. Definition: Graph Isomorphism. Given two graphs $G = (V, E, \epsilon)$ and $H =
(W, F, \eta)$, a graph isomorphism from $G$ to $H$ consists of two bijections $f : V \to W$ and
$g : E \to F$ such that, for all $e \in E$, if $\epsilon(e) = \{v,w\}$ then $\eta(g(e)) = \{f(v), f(w)\}$ (we
allow $v = w$ here). Digraph isomorphisms are defined similarly: $\epsilon(e) = (v,w)$ implies
$\eta(g(e)) = (f(v), f(w))$. We say $G$ and $H$ are isomorphic, written $G \cong H$, iff there exists a
graph isomorphism from $G$ to $H$.


In the case of simple graphs $G = (V, E)$ and $H = (W, F)$, a graph isomorphism can be
viewed as a bijection $f : V \to W$ that induces a bijection between the edge sets $E$ and $F$.
More specifically, this means that for all $v, w \in V$, $\{v,w\} \in E$ iff $\{f(v), f(w)\} \in F$.
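For small simple graphs, the condition just stated can be tested by brute force: try every bijection $f : V \to W$ and check whether it carries edges to edges. The sketch below is only meant to illustrate the definition (it examines up to $|V|!$ bijections, so it is not a practical algorithm); the function name and the sample graphs are hypothetical.

```python
from itertools import permutations

def are_isomorphic(V, E, W, F):
    """Brute-force isomorphism test for simple graphs (V, E) and (W, F).

    E and F are collections of two-element sets (or pairs) of vertices.
    """
    V, W = sorted(V), sorted(W)
    E = {frozenset(e) for e in E}
    F = {frozenset(e) for e in F}
    if len(V) != len(W) or len(E) != len(F):
        return False
    for image in permutations(W):
        f = dict(zip(V, image))
        # f is a bijection and |E| = |F|, so mapping E into F forces
        # {v,w} in E  iff  {f(v), f(w)} in F.
        if all(frozenset({f[v], f[w]}) in F for v, w in map(tuple, E)):
            return True
    return False

if __name__ == "__main__":
    path = [(1, 2), (2, 3), (3, 4)]                 # a path on 4 vertices
    relabeled = [("c", "a"), ("a", "d"), ("d", "b")]
    star = [(1, 2), (1, 3), (1, 4)]                 # one center joined to 3 leaves
    print(are_isomorphic({1, 2, 3, 4}, path, {"a", "b", "c", "d"}, relabeled))
    print(are_isomorphic({1, 2, 3, 4}, path, {1, 2, 3, 4}, star))
```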




3.2. Walks and Matrices 


We can travel through a graph by following a succession of edges. Formalizing this idea 
leads to the concept of a walk. 


3.9. Definition: Walks, Paths, Cycles. Let $G = (V, E, \epsilon)$ be a graph or digraph. A walk
in $G$ is a sequence
$$W = (v_0, e_1, v_1, e_2, v_2, \ldots, e_s, v_s)$$

where $s \ge 0$, $v_i \in V$ for all $i$, $e_i \in E$ for all $i$, and $e_i$ is an edge from $v_{i-1}$ to $v_i$ for $1 \le i \le s$.
We say that $W$ is a walk of length $s$ from $v_0$ to $v_s$. The walk $W$ is closed iff $v_0 = v_s$. The
walk $W$ is a path iff the vertices $v_0, v_1, \ldots, v_s$ are pairwise distinct (which forces the edges
$e_i$ to be distinct as well). The walk $W$ is a cycle iff $s > 0$, $v_1, \ldots, v_s$ are distinct, $e_1, \ldots, e_s$
are distinct, and $v_0 = v_s$. A $k$-cycle is a cycle of length $k$. In the case of simple graphs and
simple digraphs, the edges $e_i$ are determined uniquely by their endpoints. So, in the simple
case, we can regard a walk as a sequence of vertices $(v_0, v_1, \ldots, v_s)$ such that there is an
edge from $v_{i-1}$ to $v_i$ in $G$ for $1 \le i \le s$.


3.10. Remark. When considering cycles in a digraph, we usually identify two cycles that 
are cyclic shifts of one another (unless we need to keep track of the starting vertex of the 
cycle). Similarly, we identify cycles in a graph that are cyclic shifts or reversals of one 
another. 


FIGURE 3.2
Digraph used to illustrate adjacency matrices.

3.11. Example. In the graph $G_1$ from Figure 3.1,
$$W_1 = (2, c, 3, f, 3, b, 4, h, 5, i, 4, i, 5)$$

is a walk of length 6 from vertex 2 to vertex 5. In the simple graph $G_2$ in the same figure,


$W_2 = (1,6,2,3,4,5)$ is a walk and a path of length 5, whereas $C = (6,5,4,3,2,6)$ is a
5-cycle. We usually identify $C$ with the cycles $(5,4,3,2,6,5)$, $(6,2,3,4,5,6)$, etc. In the
digraph $G_3$,
$$W_3 = (1, a, 1, g, 5, h, 2, m, 4, j, 5, h, 2, d, 3)$$

is a walk from vertex 1 to vertex 3; $(5, h, 2, m, 4, j, 5)$ is a 3-cycle; $(4, l, 4)$ is a 1-cycle; and
$(5, f, 1, g, 5)$ is a 2-cycle. Observe that 1-cycles are the same as loop edges, and 2-cycles
cannot exist in simple graphs or simple digraphs. For any vertex $v$ in a graph or digraph,
$(v)$ is a walk of length zero from $v$ to $v$, which is a path but not a cycle.
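The distinctions among walks, paths, and cycles in Definition 3.9 can be tested mechanically in the simple-graph case, where a walk is just a vertex sequence. The sketch below uses the edge set of $G_2$ from Example 3.4; the helper names are illustrative, and the code covers simple graphs only (no loops or multiple edges).

```python
# Edge set of the simple graph G_2 (Example 3.4).
E2 = {frozenset(e) for e in
      [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (1, 6), (2, 6), (1, 4)]}

def edges_of(vs):
    """The edges {v_{i-1}, v_i} traversed by the vertex sequence vs."""
    return [frozenset({u, v}) for u, v in zip(vs, vs[1:])]

def is_walk(vs, E):
    """Consecutive vertices must be joined by an edge; (v) is a walk of length 0."""
    return len(vs) >= 1 and all(e in E for e in edges_of(vs))

def is_path(vs, E):
    """A walk whose vertices v_0, ..., v_s are pairwise distinct."""
    return is_walk(vs, E) and len(set(vs)) == len(vs)

def is_cycle(vs, E):
    """A walk with s > 0, v_0 = v_s, v_1..v_s distinct, and e_1..e_s distinct."""
    s = len(vs) - 1
    es = edges_of(vs)
    return (is_walk(vs, E) and s > 0 and vs[0] == vs[-1]
            and len(set(vs[1:])) == s and len(set(es)) == s)

if __name__ == "__main__":
    print(is_path((1, 6, 2, 3, 4, 5), E2))   # W_2 from Example 3.11
    print(is_cycle((6, 5, 4, 3, 2, 6), E2))  # C from Example 3.11
    print(is_cycle((1, 2, 1), E2))           # reuses the edge {1,2}
```

The edge-distinctness test is what excludes $(1,2,1)$: its vertices $v_1, v_2$ are distinct, but it traverses the same edge twice.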


We can now formulate our first counting problem: how many walks of a given length 

are there between two given vertices in a graph or digraph? We will develop an algebraic 
solution to this problem in which concatenation of walks is modeled by multiplication of 
suitable matrices. 
3.12. Definition: Adjacency Matrix. Let $G$ be a graph or digraph with vertex set
$X = \{x_i : 1 \le i \le n\}$. The adjacency matrix of $G$ (relative to the given indexing of the
vertices) is the $n \times n$ matrix $A$ whose $i,j$-entry $A(i,j)$ is the number of edges in $G$ from $x_i$
to $x_j$.


3.13. Example. The adjacency matrix for the digraph $G$ in Figure 3.2 is

$$A = \begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 3 & 0 & 1 & 0 \\
0 & 1 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 2
\end{pmatrix}.$$

3.14. Example. If $G$ is a graph, edges from $v$ to $w$ are the same as edges from $w$ to $v$.
So, the adjacency matrix for $G$ is a symmetric matrix ($A(i,j) = A(j,i)$ for all $i,j$). If $G$
is a simple graph, the adjacency matrix consists of all 1’s and 0’s with zeroes on the main
diagonal. For example, the adjacency matrix of the simple graph in Figure 3.3 is

$$\begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0
\end{pmatrix}.$$

FIGURE 3.3
A simple graph.
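Definition 3.12 translates directly into a procedure for building the adjacency matrix of a graph from its endpoint function. As an illustration, the sketch below does this for the graph $G_1$ of Example 3.2 (whose edges are listed explicitly in the text) and confirms that the result is symmetric. The function name is an illustrative choice, and the code handles undirected graphs only.

```python
def adjacency_matrix(vertices, eps):
    """Adjacency matrix of a graph, relative to the given ordering of the vertices.

    eps maps each edge name to a frozenset of one or two endpoints.
    A loop at v contributes 1 to A(v, v); any other edge {v, w}
    contributes 1 to both A(v, w) and A(w, v).
    """
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    A = [[0] * n for _ in range(n)]
    for ends in eps.values():
        ends = sorted(ends)
        i, j = index[ends[0]], index[ends[-1]]
        A[i][j] += 1
        if i != j:
            A[j][i] += 1
    return A

# Endpoint function of G_1 (Example 3.2).
eps = {
    "a": frozenset({1, 4}), "b": frozenset({4, 3}), "c": frozenset({2, 3}),
    "d": frozenset({1, 2}), "e": frozenset({1, 2}), "f": frozenset({3}),
    "g": frozenset({2, 5}), "h": frozenset({4, 5}), "i": frozenset({4, 5}),
}
A1 = adjacency_matrix([1, 2, 3, 4, 5], eps)

if __name__ == "__main__":
    for row in A1:
        print(row)
```

Because $G_1$ has a loop and parallel edges, its matrix has a nonzero diagonal entry and entries larger than 1, in contrast to the 0/1 matrix of a simple graph.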


Recall from linear algebra the definition of the product of two matrices. 


3.15. Definition: Matrix Multiplication. Suppose $A$ is an $m \times n$ matrix and $B$ is an
$n \times p$ matrix. Then $AB$ is the $m \times p$ matrix whose $i,j$-entry is

$$(AB)(i,j) = \sum_{k=1}^{n} A(i,k)B(k,j) \qquad (1 \le i \le m,\ 1 \le j \le p). \tag{3.1}$$


Matrix multiplication is associative, so we can write a product of three or more (compat- 
ible) matrices without any parentheses. The next theorem gives a formula for the general 
entry in such a product. 


3.16. Theorem: Product of Several Matrices. Assume $A_1, \ldots, A_s$ are matrices such
that $A_i$ has dimensions $n_{i-1} \times n_i$. Then $A_1 A_2 \cdots A_s$ is the $n_0 \times n_s$ matrix whose $k_0,k_s$-entry
is
$$(A_1 A_2 \cdots A_s)(k_0, k_s) = \sum_{k_1=1}^{n_1} \sum_{k_2=1}^{n_2} \cdots \sum_{k_{s-1}=1}^{n_{s-1}} A_1(k_0,k_1) A_2(k_1,k_2) A_3(k_2,k_3) \cdots A_s(k_{s-1},k_s) \tag{3.2}$$
for all $1 \le k_0 \le n_0$, $1 \le k_s \le n_s$.


Proof. We use induction on $s$. The case $s = 1$ is immediate, and the case $s = 2$ is the
definition of matrix multiplication (after a change in notation). Assume $s > 2$ and that
(3.2) is known to hold for the product $B = A_1 A_2 \cdots A_{s-1}$. We can think of the given
product $A_1 A_2 \cdots A_{s-1} A_s$ as the binary product $B A_s$. Therefore, using (3.1), the $k_0,k_s$-entry
of $A_1 A_2 \cdots A_s$ is

$$\begin{aligned}
\sum_{k=1}^{n_{s-1}} B(k_0,k) A_s(k,k_s)
&= \sum_{k=1}^{n_{s-1}} \Bigl( \sum_{k_1=1}^{n_1} \cdots \sum_{k_{s-2}=1}^{n_{s-2}} A_1(k_0,k_1) A_2(k_1,k_2) \cdots A_{s-1}(k_{s-2},k) \Bigr) A_s(k,k_s) \\
&= \sum_{k_1=1}^{n_1} \cdots \sum_{k_{s-2}=1}^{n_{s-2}} \sum_{k=1}^{n_{s-1}} A_1(k_0,k_1) A_2(k_1,k_2) \cdots A_{s-1}(k_{s-2},k) A_s(k,k_s).
\end{aligned}$$

Replacing $k$ by $k_{s-1}$ in the innermost summation, we obtain the result. □


Taking all $A_i$’s in the theorem to be the same matrix $A$, we obtain the following formula
for the entries of the powers of a given square matrix.
3.17. Corollary: Powers of a Matrix. Suppose $A$ is an $n \times n$ matrix. For each integer
$s > 0$,

$$A^s(i,j) = \sum_{k_1=1}^{n} \cdots \sum_{k_{s-1}=1}^{n} A(i,k_1) A(k_1,k_2) \cdots A(k_{s-2},k_{s-1}) A(k_{s-1},j) \qquad (1 \le i,j \le n). \tag{3.3}$$


The preceding formula may appear unwieldy. But it is precisely the tool we need to 
count walks in graphs. 
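As a sanity check on formula (3.3), the sketch below evaluates the nested sum literally, by iterating over all index tuples $(k_1, \ldots, k_{s-1})$, and compares the result with the matrix power obtained by repeated multiplication. The function names and the $2 \times 2$ test matrix are illustrative choices; indices are 0-based in the code.

```python
from itertools import product

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, s):
    """A^s for s >= 1, by repeated multiplication."""
    P = A
    for _ in range(s - 1):
        P = mat_mul(P, A)
    return P

def entry_by_nested_sum(A, s, i, j):
    """Right side of (3.3): sum over all (k_1, ..., k_{s-1}) of the chained products."""
    n = len(A)
    total = 0
    for ks in product(range(n), repeat=s - 1):
        idx = (i, *ks, j)
        term = 1
        for a, b in zip(idx, idx[1:]):
            term *= A[a][b]
        total += term
    return total

if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    print(mat_pow(A, 3))
    print([[entry_by_nested_sum(A, 3, i, j) for j in range(2)] for i in range(2)])
```

The nested sum has $n^{s-1}$ terms, so it is far slower than repeated multiplication; its value here is as a description of what each entry of $A^s$ counts.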


3.18. Theorem: Enumeration of Walks. Let $G$ be a graph or digraph with vertex set
$X = \{x_1, \ldots, x_n\}$, and let $A$ be the adjacency matrix of $G$. For all $i, j \le n$ and all $s \ge 0$,
the $i,j$-entry of $A^s$ is the number of walks of length $s$ in $G$ from $x_i$ to $x_j$.


Proof. The result holds for $s = 0$, since $A^0 = I_n$ (the $n \times n$ identity matrix) and there is
exactly one walk of length zero from any vertex to itself. Now suppose $s > 0$. A walk of
length $s$ from $x_i$ to $x_j$ will visit $s - 1$ intermediate vertices (not necessarily distinct from
each other or from $x_i$ or $x_j$). Let $(x_i, x_{k_1}, \ldots, x_{k_{s-1}}, x_j)$ be the ordered list of vertices visited
by the walk. To build such a walk, we choose any edge from $x_i$ to $x_{k_1}$ in $A(i,k_1)$ ways; then
we choose any edge from $x_{k_1}$ to $x_{k_2}$ in $A(k_1,k_2)$ ways; and so on. By the product rule, the
total number of walks associated to this vertex sequence is $A(i,k_1) A(k_1,k_2) \cdots A(k_{s-1},j)$.
This formula holds even if there are no walks with this vertex sequence, since some term
in the product will be zero in this case. Applying the sum rule produces the right side of
(3.3), and the result follows. □
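
The theorem can be checked numerically on a small case. The following Python sketch compares the entries of $A^3$ with a direct evaluation of the sum in (3.3). The 3-vertex digraph is a hypothetical example chosen for illustration (it is not one of the figures in the text), vertices are 0-indexed, and the helper names are our own.

```python
from itertools import product

def mat_mul(A, B):
    """Multiply two matrices (lists of rows) using formula (3.1)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(A, s):
    """Return A^s for an n x n matrix A and an integer s >= 0."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    for _ in range(s):
        result = mat_mul(result, A)
    return result

def count_walks_brute(A, i, j, s):
    """Count walks of length s from vertex i to vertex j via the sum in (3.3)."""
    if s == 0:
        return int(i == j)
    total = 0
    for mids in product(range(len(A)), repeat=s - 1):
        seq = (i, *mids, j)
        term = 1
        for a, b in zip(seq, seq[1:]):
            term *= A[a][b]          # number of edges from a to b
        total += term
    return total

# Hypothetical digraph: two parallel edges 0->1, one edge 1->2,
# one edge 2->0, and a loop at vertex 2.
A = [[0, 2, 0],
     [0, 0, 1],
     [1, 0, 1]]
A3 = mat_pow(A, 3)
```

For instance, `A3[0][0]` counts the closed walks of length 3 at vertex 0; each uses one of the two parallel edges from 0 to 1, then the edges 1->2 and 2->0.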


3.19. Example. Consider again the adjacency matrix A of the digraph G in Figure 3.2. 
Some matrix computations show that 


ee | [ie ee Sl ee E78) 
it MG SB 1 0 0163 60 
pate, (OF OER (Oh.0 as_| 2 9 6 1 3 0 
GO) $0.2 30 | 2 0. Be SF Ba 
00613 0 O: Bs 3.36: PG 
00000 4 00000 8 


So, for example, there are six walks of length 2 from vertex 5 to vertex 3, and there are 
seven walks of length 3 that start and end at vertex 4. 




3.3 DAGs and Nilpotent Matrices


Next we consider the question of counting all walks (of any length) between two vertices 
in a digraph. The question is uninteresting for graphs, since the number of walks between 
two distinct vertices v,w in a graph is either zero or infinity. This follows since a walk is 
allowed to repeatedly traverse a particular edge along a path from v to w, which leads to 
arbitrarily long walks from v to w. Similarly, if G is a digraph that contains a cycle, we 
obtain arbitrarily long walks between two vertices on the cycle by going around the cycle 
again and again. To rule out these possibilities, we restrict attention to the following class 
of digraphs. 


3.20. Definition: DAGs. A DAG is a digraph with no cycles. (The acronym DAG stands
for “directed acyclic graph.”)

To characterize adjacency matrices of DAGs, we need another concept from matrix
theory.


3.21. Definition: Nilpotent Matrices. An $n \times n$ matrix $A$ is called nilpotent iff $A^s = 0$
for some integer $s \ge 1$. The least such integer $s$ is called the index of nilpotence of $A$.


Note that if $A^s = 0$ and $t \ge s$, then $A^t = 0$ also.


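3.22. Example. The zero matrix is the unique $n \times n$ matrix with index of nilpotence 1. The
square of the nonzero matrix $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ is zero, so $A$ is nilpotent of index 2. Similarly,
any matrix

$$B = \begin{pmatrix} 0 & x & y \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix} \qquad (x, y, z \in \mathbb{R})$$

satisfies

$$B^2 = \begin{pmatrix} 0 & 0 & xz \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad B^3 = 0,$$

so we obtain examples of matrices that are nilpotent of index 3 (whenever $xz \ne 0$). The next
result generalizes this example.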


3.23. Theorem: Nilpotence of Strictly Triangular Matrices. Suppose $A$ is an $n \times n$
strictly upper-triangular matrix, which means $A(i,j) = 0$ for all $i \ge j$. Then $A$ is nilpotent
of index at most $n$. A similar result holds for strictly lower-triangular matrices.


Proof. It suffices to show that $A^n$ is the zero matrix. By (3.3), we have

$$A^n(k_0, k_n) = \sum_{k_1=1}^{n} \cdots \sum_{k_{n-1}=1}^{n} A(k_0,k_1)\, A(k_1,k_2) \cdots A(k_{n-1},k_n)$$

for all $k_0, k_n \le n$. We claim that each term in this sum is zero. Otherwise, there would exist
$k_1, \ldots, k_{n-1}$ such that $A(k_{t-1}, k_t) \ne 0$ for $1 \le t \le n$. But since $A$ is strictly upper-triangular,
we would then have

$$k_0 < k_1 < k_2 < \cdots < k_n.$$

This cannot occur, since the $n + 1$ integers $k_0, \ldots, k_n$ all lie between 1 and $n$. □

The next theorem reveals the connection between nilpotent matrices and DAGs.


3.24. Theorem: DAGs and Nilpotent Matrices. Let $G$ be a digraph with vertex set
$X = \{x_1, \ldots, x_n\}$ and adjacency matrix $A$. $G$ is a DAG iff $A$ is nilpotent. When $G$ is a
DAG, there exists an ordering of the vertex set $X$ that makes $A$ a strictly lower-triangular
matrix.


Proof. Assume first that $G$ is not a DAG, so that $G$ has at least one cycle. Let $x_i$ be any
fixed vertex involved in this cycle, and let $c \ge 1$ be the length of this cycle. By going
around the cycle zero or more times, we obtain walks from $x_i$ to $x_i$ of lengths $0, c, 2c, 3c, \ldots$.
By 3.18, it follows that the $(i,i)$-entry of $A^{kc}$ is at least 1, for all $k \ge 0$. This fact prevents
any positive power of $A$ from being the zero matrix, so $A$ is not nilpotent.

Conversely, assume that $A$ is not nilpotent. Then, in particular, $A^n \ne 0$, so there exist
indices $k_0, k_n$ with $A^n(k_0, k_n) \ne 0$. Using 3.18 again, we deduce that there is a walk in $G$
visiting a sequence of vertices

$$(x_{k_0}, x_{k_1}, x_{k_2}, \ldots, x_{k_{n-1}}, x_{k_n}).$$

Since $G$ has only $n$ vertices, not all of the $n + 1$ vertices just listed are distinct. If we choose
$i$ minimal and then $j > i$ minimal such that $x_{k_i} = x_{k_j}$, then there is a cycle in $G$ visiting
the vertices $(x_{k_i}, x_{k_{i+1}}, \ldots, x_{k_j})$. So $G$ is not a DAG.

We prove the statement about lower-triangular matrices by induction on $n$. A one-vertex
DAG must have adjacency matrix $(0)$, so the result holds for $n = 1$. Suppose $n > 1$ and
the result is known for DAGs with $n - 1$ vertices. Create a walk $(v_0, e_1, v_1, e_2, v_2, \ldots)$ in $G$
by starting at any vertex and repeatedly following any edge leading away from the current
vertex. Since $G$ has no cycles, the vertices $v_j$ reached by this walk are pairwise distinct.
Since there are only $n$ available vertices, our walk must terminate at a vertex $v_j$ with no
outgoing edges. Let $x'_1 = v_j$. Deleting $x'_1$ and all edges leading into this vertex will produce
an $(n - 1)$-vertex digraph $G'$ that is also a DAG, as one immediately verifies. By induction,
there is an ordering $x'_2, \ldots, x'_n$ of the vertices of $G'$ such that the associated adjacency
matrix $A'$ is strictly lower-triangular. Now, relative to the ordering $x'_1, x'_2, \ldots, x'_n$ of the
vertices of $G$, the adjacency matrix of $G$ has the form

$$A = \begin{pmatrix} 0 & 0 \;\cdots\; 0 \\ \ast & A' \end{pmatrix},$$

where the first row is zero because $x'_1$ has no outgoing edges and the column $\ast$ records the
edges leading into $x'_1$; this matrix is strictly lower-triangular. □
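
The inductive construction in the last part of the proof is effectively an algorithm: repeatedly pick a vertex with no outgoing edges among the vertices that remain. A minimal Python sketch follows, assuming 0-indexed vertices and an adjacency matrix stored as a list of rows. The function names and the sample digraph are illustrative assumptions, not taken from the text.

```python
def sink_first_order(A):
    """Return a vertex order making A strictly lower-triangular, or None if a cycle exists.

    At each stage we pick a vertex with no outgoing edges to the remaining
    vertices (a sink of the remaining digraph) and list it next.
    """
    remaining = set(range(len(A)))
    order = []
    while remaining:
        sink = next((v for v in sorted(remaining)
                     if all(A[v][w] == 0 for w in remaining)), None)
        if sink is None:   # every remaining vertex has an outgoing edge, so a cycle exists
            return None
        order.append(sink)
        remaining.remove(sink)
    return order

def reorder(A, order):
    """Adjacency matrix of the same digraph relative to the given vertex order."""
    return [[A[u][v] for v in order] for u in order]

def is_strictly_lower(M):
    """True iff M(i, j) = 0 whenever i <= j."""
    return all(M[i][j] == 0 for i in range(len(M)) for j in range(i, len(M)))

# Hypothetical DAG with edges 0->1, 0->2, 1->2, 2->3.
A = [[0, 1, 1, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
order = sink_first_order(A)
M = reorder(A, order)
```

Returning `None` when no sink remains reflects the first half of the theorem: a digraph in which every vertex has an outgoing edge must contain a cycle, so its adjacency matrix is not nilpotent.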


The next result will allow us to count walks of any length in a DAG. 


3.25. Theorem: Inverse of $I - A$ for Nilpotent $A$. Suppose $A$ is a nilpotent $n \times n$
matrix with $A^s = 0$. Let $I$ be the $n \times n$ identity matrix. Then $I - A$ is an invertible matrix
with inverse

$$(I - A)^{-1} = I + A + A^2 + A^3 + \cdots + A^{s-1}.$$


Proof. Let $B = I + A + A^2 + \cdots + A^{s-1}$. By the distributive law,

$$(I - A)B = IB - AB = (I + A + A^2 + \cdots + A^{s-1}) - (A + A^2 + \cdots + A^{s-1} + A^s) = I - A^s.$$

Since $A^s = 0$, we see that $(I - A)B = I$. A similar calculation shows that $B(I - A) = I$.
Therefore $B$ is the two-sided matrix inverse of $I - A$. □
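
As a quick sanity check of this formula, the sketch below forms $B = I + A + A^2$ for a strictly upper-triangular $3 \times 3$ matrix (so $A^3 = 0$) and multiplies it by $I - A$ on both sides. The entries 2, 3, 5 are arbitrary illustrative values, and the helper names are our own.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def geometric_sum(A, s):
    """Return I + A + A^2 + ... + A^(s-1)."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for _ in range(1, s):
        power = mat_mul(power, A)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Strictly upper-triangular, hence nilpotent with A^3 = 0 (illustrative values).
A = [[0, 2, 3],
     [0, 0, 5],
     [0, 0, 0]]
B = geometric_sum(A, 3)
I_minus_A = [[int(i == j) - A[i][j] for j in range(3)] for i in range(3)]
left = mat_mul(I_minus_A, B)    # should be the identity
right = mat_mul(B, I_minus_A)   # should be the identity
```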


3.26. Remark. The previous result for matrices can be remembered by noting the analogy
to the geometric series formula for real numbers:

$$(1 - r)^{-1} = \frac{1}{1 - r} = 1 + r + r^2 + \cdots + r^{s-1} + r^s + \cdots \qquad (|r| < 1).$$


3.27. Theorem: Counting Paths in a DAG. Let $G$ be a DAG with vertex set
$\{x_1, \ldots, x_n\}$ and adjacency matrix $A$. For all $i, j \le n$, the total number of paths from
$x_i$ to $x_j$ in $G$ (of any length) is the $i,j$-entry of $(I - A)^{-1}$.


Proof. By 3.18, the number of walks of length $t \ge 0$ from $x_i$ to $x_j$ is $A^t(i,j)$. Because $G$ is
a DAG, we have $A^t = 0$ for all $t \ge n$. By the sum rule, the total number of walks from $x_i$
to $x_j$ is $\sum_{t=0}^{n-1} A^t(i,j)$. By 3.25, this number is precisely the $i,j$-entry of $(I - A)^{-1}$. Finally,
one readily confirms that every walk in a DAG must actually be a path. □


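FIGURE 3.4
Example of a DAG.


3.28. Example. Consider the DAG shown in Figure 3.4. Its adjacency matrix is

$$A = \begin{pmatrix}
0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 2 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}.$$

Using a computer algebra system, we compute

$$(I - A)^{-1} = \begin{pmatrix}
1 & 0 & 1 & 1 & 0 & 1 & 5 & 7 \\
0 & 1 & 0 & 1 & 1 & 4 & 9 & 13 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 1 & 0 & 1 & 3 & 4 \\
0 & 0 & 0 & 0 & 1 & 2 & 4 & 6 \\
0 & 0 & 0 & 0 & 0 & 1 & 2 & 3 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.$$

So, for example, there are 13 paths from vertex 2 to vertex 8, and 4 paths from vertex 5 to
vertex 7.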
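
The computation in this example does not require a computer algebra system: by 3.25 and 3.27 it suffices to add up the powers $I + A + \cdots + A^{n-1}$. The Python sketch below does this for the $8 \times 8$ adjacency matrix of the example, as we read it from the text (rows and columns are 0-indexed in the code, so vertex $v$ corresponds to index $v - 1$). The function names are our own.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def path_count_matrix(A):
    """Return I + A + ... + A^(n-1), which equals (I - A)^(-1) when A^n = 0."""
    n = len(A)
    total = [[int(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]          # current power of A, starting at A^0
    for _ in range(1, n):
        power = mat_mul(power, A)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Adjacency matrix of the DAG in Example 3.28 (as read from the text).
A = [
    [0, 0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 0, 0, 2, 1],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
P = path_count_matrix(A)
paths_2_to_8 = P[1][7]   # vertex 2 -> vertex 8
paths_5_to_7 = P[4][6]   # vertex 5 -> vertex 7
```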


3.4 Vertex Degrees 
In many situations, one needs to know how many edges lead into or out of each vertex in a 
digraph. 


3.29. Definition: Indegree and Outdegree. Let $G = (V, E, \epsilon)$ be a digraph. For each
$v \in V$, the outdegree of $v$, denoted $\mathrm{outdeg}_G(v)$, is the number of edges $e \in E$ from $v$; the
indegree of $v$, denoted $\mathrm{indeg}_G(v)$, is the number of edges $e \in E$ to $v$. Formally,

$$\mathrm{outdeg}_G(v) = \sum_{w \in V} \sum_{e \in E} \chi(\epsilon(e) = (v,w)); \qquad \mathrm{indeg}_G(v) = \sum_{w \in V} \sum_{e \in E} \chi(\epsilon(e) = (w,v)).$$

A source is a vertex of indegree zero. A sink is a vertex of outdegree zero.


3.30. Example. Let $G_3$ be the digraph on the right in Figure 3.1. We have

$$(\mathrm{indeg}_{G_3}(1), \ldots, \mathrm{indeg}_{G_3}(7)) = (2, 2, 4, 2, 2, 0, 1);$$
$$(\mathrm{outdeg}_{G_3}(1), \ldots, \mathrm{outdeg}_{G_3}(7)) = (3, 4, 0, 3, 2, 1, 0).$$

Vertex 6 is the only source, whereas vertices 3 and 7 are sinks. A loop edge at $v$ contributes
1 to both the indegree and outdegree of $v$. The sum of all the indegrees is 13, which is
also the sum of all the outdegrees, and is also the number of edges in the digraph. This
phenomenon is explained in the next theorem.


3.31. Theorem: Degree-Sum Formula for Digraphs. In any digraph $G = (V, E, \epsilon)$,

$$\sum_{v \in V} \mathrm{indeg}_G(v) = |E| = \sum_{v \in V} \mathrm{outdeg}_G(v).$$


Proof. By the formal definition of indegree, we have

$$\sum_{v \in V} \mathrm{indeg}_G(v) = \sum_{v \in V} \sum_{w \in V} \sum_{e \in E} \chi(\epsilon(e) = (w,v))
= \sum_{e \in E} \left( \sum_{w \in V} \sum_{v \in V} \chi(\epsilon(e) = (w,v)) \right).$$

For each $e \in E$, the term in brackets is equal to one for exactly one ordered pair $(w,v)$, and
is zero otherwise. So the sum evaluates to $\sum_{e \in E} 1 = |E|$. The formula involving outdegree
is proved similarly. □
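
The degree-sum formula is easy to observe computationally: each edge $(v, w)$ adds one to the outdegree of $v$ and one to the indegree of $w$. The sketch below uses a hypothetical digraph given as a list of ordered pairs (with a repeated pair standing for parallel edges). The edge list is an assumption made for illustration and is not the digraph $G_3$ of Figure 3.1.

```python
from collections import Counter

# Hypothetical digraph on vertices 1..5; (1, 2) appears twice (parallel edges)
# and (3, 3) is a loop.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 2), (2, 3), (3, 3), (3, 1), (4, 1), (2, 5)]

outdeg = Counter(v for v, _ in edges)   # edges leaving each vertex
indeg = Counter(w for _, w in edges)    # edges entering each vertex

sources = [v for v in vertices if indeg[v] == 0]
sinks = [v for v in vertices if outdeg[v] == 0]

sum_in = sum(indeg[v] for v in vertices)
sum_out = sum(outdeg[v] for v in vertices)
```

Here `Counter` returns 0 for vertices that never appear, so sources and sinks are detected without special cases. Note that the loop at 3 contributes 1 to both the indegree and the outdegree of vertex 3.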


Next we give analogous definitions and results for graphs. 


3.32. Definition: Degree. Let $G = (V, E, \epsilon)$ be a graph. For each $v \in V$, the degree of $v$
in $G$, denoted $\deg_G(v)$, is the number of edges in $E$ with $v$ as an endpoint, where each loop
edge at $v$ is counted twice. Formally,

$$\deg_G(v) = \sum_{e \in E} 2\chi(\epsilon(e) = \{v\}) + \sum_{\substack{w \in V \\ w \ne v}} \sum_{e \in E} \chi(\epsilon(e) = \{v,w\}).$$

The degree sequence of $G$, denoted $\deg(G)$, is the multiset $[\deg_G(v) : v \in V]$. $G$ is called
$k$-regular iff every vertex in $G$ has degree $k$. $G$ is regular iff $G$ is $k$-regular for some $k \ge 0$.


3.33. Example. For the graph $G_1$ in Figure 3.1, we have

$$(\deg_{G_1}(1), \ldots, \deg_{G_1}(5)) = (3, 4, 4, 4, 3); \qquad \deg(G_1) = [4, 4, 4, 3, 3].$$

The graph in Figure 3.3 is 3-regular. In both of these graphs, the sum of all vertex degrees
is 18, which is twice the number of edges in the graph.


3.34. Theorem: Degree-Sum Formula for Graphs. For any graph $G = (V, E, \epsilon)$,

$$\sum_{v \in V} \deg_G(v) = 2|E|.$$


Proof. First assume $G$ has no loop edges. Let $X$ be the set of pairs $(v, e)$ such that $v \in V$,
$e \in E$, and $v$ is an endpoint of $e$. On one hand,

$$|X| = \sum_{v \in V} \sum_{e \in E} \chi((v,e) \in X) = \sum_{v \in V} \deg_G(v).$$

On the other hand,

$$|X| = \sum_{e \in E} \sum_{v \in V} \chi((v,e) \in X) = \sum_{e \in E} 2 = 2|E|,$$

since every edge has two distinct endpoints. So the result holds in this case.

Next, if $G$ has $k$ loop edges, let $G'$ be $G$ with these loops deleted. Then

$$\sum_{v \in V} \deg_G(v) = \sum_{v \in V} \deg_{G'}(v) + 2k,$$

since each loop edge increases some vertex degree in the sum by 2. Using the result for
loopless graphs,

$$\sum_{v \in V} \deg_G(v) = 2|E(G')| + 2k = 2|E(G)|,$$

since $G$ has $k$ more edges than $G'$. □

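
The convention that a loop counts twice is exactly what makes the formula hold in general. A small Python sketch, using a hypothetical multigraph with a loop and an isolated vertex (the data and the function name are illustrative assumptions):

```python
def degrees(vertices, edges):
    """Degree of each vertex; an edge is a pair (u, w), and a loop has u == w."""
    deg = {v: 0 for v in vertices}
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1      # for a loop this adds to the same vertex, so it counts twice
    return deg

# Hypothetical graph: two parallel edges {1,2}, edges {2,3} and {3,4},
# a loop at 3, and an isolated vertex 5.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 2), (2, 3), (3, 3), (3, 4)]

deg = degrees(vertices, edges)
degree_sum = sum(deg.values())
leaves = [v for v in vertices if deg[v] == 1]
isolated = [v for v in vertices if deg[v] == 0]
```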
Vertices of low degree are given special names. 


3.35. Definition: Isolated Vertices. An isolated vertex in a graph is a vertex of degree 
zero. 


3.36. Definition: Leaves. A leaf is a vertex of degree one.

The following result will be used later in our analysis of trees.


3.37. Two-Leaf Lemma. Suppose $G$ is a graph. One of the following three alternatives
must occur: (i) $G$ has a cycle; (ii) $G$ has no edges; or (iii) $G$ has at least two leaves.


Proof. Suppose that $G$ has no cycles and $G$ has at least one edge; we prove that $G$ has two
leaves. Since $G$ has no cycles, we can assume $G$ is simple. Let $P = (v_0, v_1, \ldots, v_s)$ be a path
of maximum length in $G$. Such a path exists, since $G$ has only finitely many vertices and
edges. Observe that $s > 0$ since $G$ has an edge, and $v_0 \ne v_s$. Note that $\deg(v_s) \ge 1$ since
$s > 0$. Assume $v_s$ is not a leaf. Then there exists a vertex $w \ne v_{s-1}$ that is adjacent to $v_s$.
Now, $w$ is different from all $v_j$ with $0 \le j < s - 1$, since otherwise $(v_j, v_{j+1}, \ldots, v_s, w = v_j)$
would be a cycle in $G$. But this means $(v_0, v_1, \ldots, v_s, w)$ is a path in $G$ longer than $P$,
contradicting maximality of $P$. So $v_s$ must be a leaf. A similar argument shows that $v_0$ is
also a leaf. □




3.5 Functional Digraphs 


We can obtain structural information about functions $f : V \to V$ by viewing these functions
as certain kinds of digraphs.


3.38. Definition: Functional Digraphs. A functional digraph on $V$ is a simple digraph
$G$ with vertex set $V$ such that $\mathrm{outdeg}_G(v) = 1$ for all $v \in V$.


FIGURE 3.5
A functional digraph.


A function $f : V \to V$ can be thought of as a set $E_f$ of ordered pairs such that for each
$x \in V$, there exists exactly one $y \in V$ with $(x, y) \in E_f$, namely $y = f(x)$. Then $(V, E_f)$ is a
functional digraph on $V$. Conversely, a functional digraph $G = (V, E)$ determines a unique
function $g : V \to V$ by letting $g(v)$ be the other endpoint of the unique edge in $E$ departing
from $v$. These comments establish a bijection between the set of functions on $V$ and the set
of functional digraphs on $V$.


3.39. Example. Figure 3.5 displays the functional digraph associated to the following
function:

$$\begin{array}{lllll}
f(1) = 15; & f(2) = 16; & f(3) = 8; & f(4) = 17; & f(5) = 5; \\
f(6) = 5; & f(7) = 4; & f(8) = 3; & f(9) = 6; & f(10) = 4; \\
f(11) = 10; & f(12) = 4; & f(13) = 10; & f(14) = 1; & f(15) = 12; \\
f(16) = 1; & f(17) = 15. & & &
\end{array}$$


We wish to understand the structure of functional digraphs. Consider the digraph $G =
(V, E)$ shown in Figure 3.5. Some of the vertices in this digraph are involved in cycles, which
are drawn at the bottom of the figure. These cycles have length one or greater, and any two
distinct cycles involve disjoint sets of vertices. The other vertices in the digraph all feed into
these cycles at different points. We can form a set partition of the vertex set of the digraph
by collecting together all vertices that feed into a particular vertex on a particular cycle.
Each such collection can be viewed as a smaller digraph that has no cycles. We will show
that these observations hold for all functional digraphs. To do this, we need a few more
definitions.


3.40. Definition: Cyclic Vertices. Let $G$ be a functional digraph on $V$. A vertex $v \in V$
is called cyclic iff $v$ belongs to some cycle of $G$; otherwise, $v$ is called acyclic.


3.41. Example. The cyclic elements for the functional digraph in Figure 3.5 are 3, 4, 5, 
8, 12, 15, and 17. 


Let $f : V \to V$ be the function associated to the functional digraph $G$. Then $v \in V$
is cyclic iff $f^i(v) = v$ for some $i \ge 1$, where $f^i$ denotes the composition of $f$ with itself
$i$ times. This fact follows since the only possible cycle involving $v$ in $G$ must look like

$$(v, f(v), f^2(v), f^3(v), \ldots).$$
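
The criterion $f^i(v) = v$ gives a direct test for cyclic vertices. The following Python sketch applies it to the function $f$ of Example 3.39, stored as a dictionary, and recovers the list of cyclic elements stated in Example 3.41. The function name `is_cyclic` is our own.

```python
# The function f of Example 3.39.
f = {1: 15, 2: 16, 3: 8, 4: 17, 5: 5, 6: 5, 7: 4, 8: 3, 9: 6, 10: 4,
     11: 10, 12: 4, 13: 10, 14: 1, 15: 12, 16: 1, 17: 15}

def is_cyclic(f, v):
    """v is cyclic iff f^i(v) = v for some i >= 1; i <= |V| always suffices."""
    x = v
    for _ in range(len(f)):
        x = f[x]
        if x == v:
            return True
    return False

cyclic = sorted(v for v in f if is_cyclic(f, v))
# Vertices of indegree zero are those that are not values of f.
indegree_zero = sorted(set(f) - set(f.values()))
```

Trying exponents up to $|V|$ is enough because a cycle through $v$ has length at most $|V|$.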


3.42. Definition: Rooted Trees. A digraph $G$ is called a rooted tree with root $v_0$ iff $G$ is
a functional digraph and $v_0$ is the unique cyclic vertex of $G$.


3.43. Theorem: Structure of Functional Digraphs. Let $G$ be a functional digraph on
$V$ with associated function $f : V \to V$. Let $C \subseteq V$ denote the set of cyclic vertices of $G$. $C$
is nonempty, and each $v \in C$ belongs to exactly one cycle of $G$. Also, there exists a unique
indexed set partition $\{S_v : v \in C\}$ of $V$ such that the following hold for all $v \in C$: (i) $v \in S_v$;
(ii) $x \in S_v$ and $x \ne v$ implies $f(x) \in S_v$; (iii) if $g : S_v \to S_v$ is defined by $g(x) = f(x)$ for
$x \ne v$, $g(v) = v$, then the functional digraph of $g$ is a rooted tree with root $v$.


Proof. First, suppose $v \in C$. Since every vertex of $G$ has exactly one outgoing edge, the
only possible cycle involving $v$ must be $(v, f(v), f^2(v), \ldots, f^i(v) = v)$. So each cyclic vertex
(if any) belongs to a unique cycle of $G$. This implies that distinct cycles of $G$ involve disjoint
sets of vertices and edges.

Next we define a surjection $r : V \to C$. The existence of $r$ will show that $C \ne \emptyset$, since
$V \ne \emptyset$. Fix $u \in V$. By repeatedly following outgoing arrows, we obtain for each $k \ge 0$ a
unique walk $(u = u_0, u_1, u_2, \ldots, u_k)$ in $G$ of length $k$. Since $V$ is finite, there must exist $i < j$
with $u_i = u_j$. Take $i$ minimal and then $j$ minimal with this property; then $(u_i, u_{i+1}, \ldots, u_j)$
is a cycle in $G$. We define $r(u) = u_i$, which is the first element on this cycle reached from $u$.
One may check that $r(u) = u$ for all $u \in C$; this implies that $r$ is surjective. On the other
hand, if $u \notin C$, the definition of $r$ shows that $r(u) = r(u_1) = r(f(u))$.

How shall we construct a set partition with the stated properties? For each $v \in C$,
consider the “fiber” $S_v = r^{-1}(\{v\}) = \{w \in V : r(w) = v\}$; then $\{S_v : v \in C\}$ is a set
partition of $V$ indexed by $C$. The remarks at the end of the last paragraph show that this
set partition satisfies (i) and (ii). To check (iii) for some $v \in C$, first note that the map $g$
defined in (iii) does map $S_v$ into $S_v$ by (i) and (ii). Suppose $W = (w_0, w_1, \ldots, w_k)$ is a cycle
in the functional digraph for $g$. Since $r(w_0) = v$, we will eventually reach $v$ by following
outgoing arrows starting at $w_0$. On the other hand, following these arrows keeps us on the
cycle $W$, so some $w_i = v$. Since $g(v) = v$, the only possibility is that $W$ is the 1-cycle $(v)$.
Thus (iii) holds for each $v \in C$.

To see that $\{S_v : v \in C\}$ is unique, let $P = \{T_v : v \in C\}$ be another set partition
with properties (i), (ii), and (iii). It is enough to show that $S_v \subseteq T_v$ for each $v \in C$. Fix
$v \in C$ and $z \in S_v$. By (ii), every element in the sequence $(z, f(z), f^2(z), \ldots)$ belongs to the
same set of $P$, say $T_w$. Then $v = r(z) = f^i(z) \in T_w$, so (i) forces $w = v$. Thus $z \in T_v$ as
desired. □
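
The map $r$ and the fibers $S_v$ from the proof can be computed directly. The sketch below is one possible implementation (the function names are assumptions of ours), again using the function $f$ of Example 3.39. For each $u$ it first locates the cycle eventually reached from $u$, then walks from $u$ until it first meets that cycle.

```python
from collections import defaultdict

# The function f of Example 3.39.
f = {1: 15, 2: 16, 3: 8, 4: 17, 5: 5, 6: 5, 7: 4, 8: 3, 9: 6, 10: 4,
     11: 10, 12: 4, 13: 10, 14: 1, 15: 12, 16: 1, 17: 15}

def first_cyclic(f, u):
    """r(u): the first cyclic vertex reached from u by following f."""
    x = u
    for _ in range(len(f)):      # after |V| steps we are certainly on a cycle
        x = f[x]
    cycle = {x}
    y = f[x]
    while y != x:                # collect the vertices of that cycle
        cycle.add(y)
        y = f[y]
    x = u
    while x not in cycle:        # walk from u until the cycle is first met
        x = f[x]
    return x

def fibers(f):
    """The indexed set partition {S_v : v cyclic} with S_v = r^{-1}({v})."""
    S = defaultdict(set)
    for u in f:
        S[first_cyclic(f, u)].add(u)
    return dict(S)

S = fibers(f)
```

For this function, `S[15]` is the vertex set of the rooted tree feeding into the cyclic vertex 15.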


We can informally summarize the previous result by saying that every functional digraph 
uniquely decomposes into disjoint rooted trees feeding into one or more disjoint cycles. There 
are two extreme cases of this decomposition that are especially interesting — the case where 
there are no trees (i.e., C = V), and the case where the whole digraph is a rooted tree (i.e., 
|C| = 1). We study these types of functional digraphs in the next two sections. 


3.6 Cycle Structure of Permutations 

The functional digraph of a bijection (permutation) has a particularly nice structure.

3.44. Example. Figure 3.6 displays the digraph associated to the following bijection:

$$\begin{array}{lllll}
h(1) = 7; & h(2) = 8; & h(3) = 4; & h(4) = 3; & h(5) = 10; \\
h(6) = 2; & h(7) = 5; & h(8) = 6; & h(9) = 9; & h(10) = 1.
\end{array}$$


FIGURE 3.6
Digraph associated to a permutation.


We see that the digraph for $h$ contains only cycles; there are no trees feeding into these
cycles. To see why this happens, compare this digraph to the digraph for the non-bijective
function $f$ in Figure 3.5. The digraph for $f$ has a rooted tree feeding into the cyclic vertex
15. Accordingly, $f$ is not injective since $f(17) = 15 = f(1)$. Similarly, if we move backwards
through the trees in the digraph of $f$, we reach vertices with indegree zero (namely 2, 7, 9,
11, 13, and 14). The existence of such vertices shows that $f$ is not surjective. Returning to
the digraph of $h$, consider what happens if we reverse the direction of all the edges in the
digraph. We obtain another functional digraph corresponding to the following function:

$$\begin{array}{lllll}
h'(1) = 10; & h'(2) = 6; & h'(3) = 4; & h'(4) = 3; & h'(5) = 7; \\
h'(6) = 8; & h'(7) = 1; & h'(8) = 2; & h'(9) = 9; & h'(10) = 5.
\end{array}$$

One sees immediately that $h'$ is the two-sided inverse for $h$.

The next theorem explains the observations in the last example.


3.45. Theorem: Cycle Decomposition of Permutations. Let $f : V \to V$ be a function
with functional digraph $G$. The map $f$ is a bijection iff every $v \in V$ is a cyclic vertex of $G$.
In this situation, $G$ is a disjoint union of cycles.


Proof. Suppose $u \in V$ is a non-cyclic vertex. By 3.43, $u$ belongs to a rooted tree $S_v$ whose
root $v$ belongs to a cycle of $G$. Following edges outward from $u$ will eventually lead to $v$;
let $y$ be the vertex in $S_v$ just before $v$ on this path. Let $z$ be the vertex just before $v$ in the
unique cycle involving $v$. We have $y \ne z$, but $f(y) = v = f(z)$. Thus, $f$ is not injective.
Conversely, suppose all vertices in $V$ are cyclic. Then the digraph $G$ is a disjoint union of
directed cycles. So every $v \in V$ has indegree 1 as well as outdegree 1. Reversing the direction
of every edge in $G$ therefore produces another functional digraph $G'$. Let $f' : V \to V$ be
the function associated to this new digraph. For all $a, b \in V$, we have $b = f(a)$ iff $(a, b)$ is an
edge of $G$ iff $(b, a)$ is an edge of $G'$ iff $a = f'(b)$. It follows that $f'$ is the two-sided inverse
for $f$, so that $f$ and $f'$ are both bijections. □
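
Both observations, the decomposition into cycles and the inverse obtained by reversing edges, can be reproduced for the bijection $h$ of Example 3.44. This is a minimal Python sketch; the function name `cycles` and the dictionary representation are our own choices.

```python
# The bijection h of Example 3.44.
h = {1: 7, 2: 8, 3: 4, 4: 3, 5: 10, 6: 2, 7: 5, 8: 6, 9: 9, 10: 1}

def cycles(perm):
    """List the cycles of a permutation, each starting at its smallest element."""
    seen = set()
    result = []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc = [start]
        seen.add(start)
        x = perm[start]
        while x != start:
            cyc.append(x)
            seen.add(x)
            x = perm[x]
        result.append(tuple(cyc))
    return result

h_cycles = cycles(h)
# Reversing every edge (a, h(a)) gives the edges (h(a), a) of the inverse.
h_inverse = {b: a for a, b in h.items()}
```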


Recall that $S(n,k)$, the Stirling number of the second kind, is the number of set partitions
of an $n$-element set into $k$ blocks. Let $c(n,k)$ be the number of permutations of an $n$-element
set whose functional digraph consists of $k$ disjoint cycles. We will show that $c(n,k) =
s'(n,k)$, the signless Stirling number of the first kind. Recall from 2.66 that the numbers
$s'(n,k)$ satisfy the recursion $s'(n,k) = s'(n-1,k-1) + (n-1)s'(n-1,k)$ for $0 < k < n$,
with initial conditions $s'(n,0) = \chi(n = 0)$ and $s'(n,n) = 1$.


3.46. Theorem: Recursion for $c(n,k)$. We have $c(n,0) = \chi(n = 0)$ and $c(n,n) = 1$ for
all $n \ge 0$. For $0 < k < n$, we have

$$c(n,k) = c(n-1,k-1) + (n-1)\,c(n-1,k).$$

Therefore, $c(n,k) = s'(n,k)$ for $0 \le k \le n$.


Proof. The identity map is the unique permutation of an $n$-element set with $n$ cycles (which
must each have length 1), so $c(n,n) = 1$. The only permutation with zero cycles is the empty
function on the empty set, so $c(n,0) = \chi(n = 0)$. Now suppose $0 < k < n$. Let $A, B, C$ be
the sets of permutations counted by $c(n,k)$, $c(n-1,k-1)$, and $c(n-1,k)$, respectively.
Note that $A$ is the disjoint union of the two sets

$$A_1 = \{f \in A : f(n) = n\} \quad\text{and}\quad A_2 = \{f \in A : f(n) \ne n\}.$$

For each $f \in A_1$, we can restrict $f$ to the domain $\{1, 2, \ldots, n-1\}$ to obtain a permutation
of these $n - 1$ elements. Since $f$ has $k$ cycles, one of which involves $n$ alone, the restriction
of $f$ must have $k - 1$ cycles. Since $f \in A_1$ is uniquely determined by its restriction to
$\{1, 2, \ldots, n-1\}$, we have a bijection from $A_1$ onto $B$.

On the other hand, let us build a typical element $f \in A_2$ by making two choices.
First, choose a permutation $g \in C$ in $c(n-1,k)$ ways. Second, choose an element $i \in
\{1, 2, \ldots, n-1\}$ in $n - 1$ ways. Let $j$ be the unique number such that $g(j) = i$. Modify the
digraph for $g$ by removing the arrow from $j$ to $i$ and replacing it by an arrow from $j$ to $n$
and an arrow from $n$ to $i$. Informally, we are splicing $n$ into the cycle just before $i$. Let $f$
be the permutation associated to the new digraph. Evidently, the splicing process does not
change the number of cycles of $g$, and $f$ satisfies $f(n) \ne n$. Thus, $f \in A_2$, and every element
of $A_2$ arises uniquely by the choice process we have described. By the sum and product
rules,

$$c(n,k) = |A| = |A_1| + |A_2| = c(n-1,k-1) + (n-1)\,c(n-1,k).$$

So $c(n,k)$ and $s'(n,k)$ satisfy the same recursion and initial conditions. A routine induction
argument now shows that $c(n,k) = s'(n,k)$ for all $n$ and $k$. □
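
The recursion can be compared against a brute-force count for small $n$. In the Python sketch below, `c` implements the recursion of 3.46 and `c_brute` enumerates all permutations of $\{0, \ldots, n-1\}$ and counts those with exactly $k$ cycles. Both function names are ours, and the brute-force check is only practical for small $n$.

```python
from functools import lru_cache
from itertools import permutations

@lru_cache(maxsize=None)
def c(n, k):
    """Number of permutations of an n-element set with k cycles, via the recursion."""
    if k < 0 or k > n:
        return 0
    if k == 0:
        return int(n == 0)
    if k == n:
        return 1
    return c(n - 1, k - 1) + (n - 1) * c(n - 1, k)

def count_cycles(p):
    """Number of cycles of a permutation p of {0, ..., n-1}, given as a tuple."""
    seen = [False] * len(p)
    count = 0
    for i in range(len(p)):
        if not seen[i]:
            count += 1
            j = i
            while not seen[j]:
                seen[j] = True
                j = p[j]
    return count

def c_brute(n, k):
    """Count permutations with k cycles by exhaustive enumeration."""
    return sum(1 for p in permutations(range(n)) if count_cycles(p) == k)

row5 = [c(5, k) for k in range(6)]
```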




3.7 Counting Rooted Trees 
Our goal in this section is to count rooted trees (see 3.42) with a fixed root vertex. 


3.47. Theorem: Enumeration of Rooted Trees. For all $n > 1$, there are $n^{n-2}$ rooted
trees on the vertex set $\{1, 2, \ldots, n\}$ with root 1.


Proof. Let $B$ be the set of rooted trees mentioned in the theorem. Let $A$ be the set of all
functions $f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ such that $f(1) = 1$ and $f(n) = n$. The product rule
shows that $|A| = n^{n-2}$. It therefore suffices to define maps $\phi : A \to B$ and $\phi' : B \to A$ that
are mutual inverses. To define $\phi$, fix $f \in A$. Let $G_f = (\{1, 2, \ldots, n\}, \{(i, f(i)) : 1 \le i \le n\})$
be the functional digraph associated with $f$. By 3.43, we can decompose the vertex set
$\{1, 2, \ldots, n\}$ of this digraph into some disjoint cycles $C_0, C_1, \ldots, C_k$ and (possibly) some
trees feeding into these cycles. For $0 \le i \le k$, let $\ell_i$ be the largest vertex in cycle $C_i$, and
write $C_i = (r_i, \ldots, \ell_i)$. We can choose the indexing of the cycles so that the numbers $\ell_i$
satisfy $\ell_0 > \ell_1 > \ell_2 > \cdots > \ell_k$. Since $f(1) = 1$ and $f(n) = n$, 1 and $n$ belong to cycles of
length 1, so that $\ell_0 = r_0 = n$, $C_0 = (n)$, $\ell_k = r_k = 1$, $C_k = (1)$, and $k > 0$. To obtain $\phi(f)$,
modify the digraph $G_f$ by removing all edges of the form $(\ell_i, r_i)$ and adding new edges
$(\ell_i, r_{i+1})$, for $0 \le i < k$. One may check that $\phi(f)$ is always a rooted tree with root 1.


3.48. Example. Suppose $n = 20$ and $f$ is the function defined as follows:

$$\begin{array}{lllll}
f(1) = 1; & f(2) = 19; & f(3) = 8; & f(4) = 17; & f(5) = 5; \\
f(6) = 5; & f(7) = 4; & f(8) = 3; & f(9) = 6; & f(10) = 1; \\
f(11) = 18; & f(12) = 4; & f(13) = 18; & f(14) = 20; & f(15) = 15; \\
f(16) = 1; & f(17) = 12; & f(18) = 4; & f(19) = 20; & f(20) = 20.
\end{array}$$

We draw the digraph of $f$ in such a way that all vertices involved in cycles occur in a
horizontal row at the bottom of the figure, and the largest element in each cycle is the rightmost
element of its cycle. We arrange these cycles so that these largest elements decrease from
left to right; in particular, vertex $n$ is always at the far left, and vertex 1 at the far right.
See Figure 3.7. To compute $\phi(f)$, we cut the “back-edges” leading left from $\ell_i$ to $r_i$ (which
are loops if $\ell_i = r_i$) and add new edges leading right from $\ell_i$ to $r_{i+1}$. See Figure 3.8.


FIGURE 3.7
A functional digraph with cycles arranged in canonical order.


FIGURE 3.8
Conversion of the digraph to a rooted tree.


Continuing the proof, let us see why $\phi$ is invertible. Let $T$ be a rooted tree on $\{1, 2, \ldots, n\}$
with root 1. Following outgoing edges from $n$ must eventually lead to the unique cyclic vertex
1. Let $P = (v_0, v_1, \ldots, v_s)$ be the vertices encountered on the way from $v_0 = n$ to $v_s = 1$.
We recursively recover the numbers $\ell_0, \ell_1, \ldots, \ell_k$ as follows. Let $\ell_0 = n$. Define $\ell_1$ to be
the largest number in $P$ following $\ell_0$. In general, after $\ell_{i-1}$ has been found, define $\ell_i$ to be
the largest number in $P$ following $\ell_{i-1}$. After finitely many steps, we will get $\ell_k = 1$ for
some $k$. Next, let $r_0 = n$, and for $i > 0$, let $r_i$ be the vertex immediately following $\ell_{i-1}$ on
the path $P$. Modify $T$ by deleting the edges $(\ell_i, r_{i+1})$ and adding edges of the form $(\ell_i, r_i)$,
for $0 \le i < k$. One can verify that every vertex in the resulting digraph $G'$ has outdegree
exactly 1, and there are loop edges in $G'$ at vertex 1 and vertex $n$. Thus, $G'$ is a functional
digraph that determines a function $f = \phi'(T) \in A$. It follows from the definition of $\phi$ that
$\phi'$ is the two-sided inverse of $\phi$. □
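
The maps $\phi$ and $\phi'$ are concrete enough to implement and test exhaustively for small $n$. In the Python sketch below, functions and rooted trees are both stored as dictionaries sending each vertex to the endpoint of its outgoing edge. The names `phi`, `phi_inverse`, `cycle_maxima`, and `is_rooted_tree` are our own, and this is one possible rendering of the construction rather than a definitive one.

```python
from itertools import product

def cycle_maxima(f):
    """Largest vertex of each cycle of f, in decreasing order (l_0 > l_1 > ...)."""
    maxima = set()
    for v in f:
        x = v
        for _ in range(len(f)):          # after |V| steps we are on a cycle
            x = f[x]
        cyc, y = [x], f[x]
        while y != x:
            cyc.append(y)
            y = f[y]
        maxima.add(max(cyc))
    return sorted(maxima, reverse=True)

def phi(f):
    """Replace each back-edge (l_i, r_i) by (l_i, r_{i+1}) for 0 <= i < k."""
    ells = cycle_maxima(f)
    rs = [f[l] for l in ells]            # r_i = f(l_i)
    t = dict(f)
    for i in range(len(ells) - 1):
        t[ells[i]] = rs[i + 1]
    return t

def phi_inverse(t):
    """Recover f from a rooted tree t on {1, ..., n} with root 1."""
    n = len(t)
    path = [n]
    while path[-1] != 1:
        path.append(t[path[-1]])         # the path P from n to the root 1
    f = dict(t)
    pos, r = 0, n                        # path[pos] = l_i and r = r_i, starting at i = 0
    while path[pos] != 1:
        f[path[pos]] = r                 # restore the back-edge (l_i, r_i)
        r = path[pos + 1]                # r_{i+1} immediately follows l_i on P
        rest = path[pos + 1:]
        pos = pos + 1 + rest.index(max(rest))   # l_{i+1}: largest vertex after l_i
    return f

def is_rooted_tree(t):
    """True iff 1 is a fixed point and every vertex eventually reaches 1."""
    def end(v):
        for _ in range(len(t)):
            v = t[v]
        return v
    return t[1] == 1 and all(end(v) == 1 for v in t)

def all_functions(n):
    """All f on {1, ..., n} with f(1) = 1 and f(n) = n (the set A), for n >= 2."""
    return [dict(zip(range(1, n + 1), (1, *mid, n)))
            for mid in product(range(1, n + 1), repeat=n - 2)]
```

For example, with $n = 4$ and $f(1) = 1$, $f(2) = 3$, $f(3) = 2$, $f(4) = 4$, the cycles in canonical order are $(4)$, $(2, 3)$, $(1)$, and `phi` produces the tree with edges $4 \to 2 \to 3 \to 1$.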


3.49. Example. Suppose $n = 9$ and $T$ is the rooted tree shown on the left in Figure 3.9.
We first redraw the picture of $T$ so that the vertices on the path $P$ from $n$ to 1 occur in
a horizontal row at the bottom of the picture, with $n$ on the left and 1 on the right. We
recover $\ell_i$ and $r_i$ by the procedure above, and then delete the appropriate edges of $T$ and
add appropriate back-edges to create cycles. The resulting functional digraph appears on
the bottom right in Figure 3.9. So $\phi'(T)$ is the function $g$ defined as follows:

$$\begin{array}{lllll}
g(1) = 1; & g(2) = 2; & g(3) = 2; & g(4) = 9; & g(5) = 9; \\
g(6) = 7; & g(7) = 6; & g(8) = 9; & g(9) = 9. &
\end{array}$$


FIGURE 3.9
Conversion of a rooted tree to a functional digraph.

3.8 Connectedness and Components 


In many applications of graphs, it is important to know whether every vertex is reachable 
from every other vertex. 


3.50. Definition: Connectedness. Let $G = (V, E, \epsilon)$ be a graph or digraph. $G$ is
connected iff for all $u, v \in V$, there is a walk in $G$ from $u$ to $v$.


3.51. Example. The graph $G_1$ in Figure 3.1 is connected, but the simple graph $G_2$ and
the digraph $G_3$ in that figure are not connected.


Connectedness can also be described using paths instead of walks. 


3.52. Theorem: Walks vs. Paths. Let $G = (V, E, \epsilon)$ be a graph or digraph, and let
$u, v \in V$. There is a walk in $G$ from $u$ to $v$ iff there is a path in $G$ from $u$ to $v$.


Proof. Let $W = (v_0, e_1, v_1, \ldots, e_s, v_s)$ be a walk in $G$ from $u$ to $v$. We describe an algorithm
to convert the walk $W$ into a path from $u$ to $v$. If all the vertices $v_i$ are distinct, then the
edges $e_i$ must also be distinct, so $W$ is already a path. Otherwise, choose $i$ minimal such
that $v_i$ appears more than once in $W$, and then choose $j$ maximal such that $v_i = v_j$. Then
$W_1 = (v_0, e_1, v_1, \ldots, e_i, v_i, e_{j+1}, v_{j+1}, \ldots, e_s, v_s)$ is a walk from $u$ to $v$ of shorter length than
$W$. If $W_1$ is a path, we are done. Otherwise, we repeat the argument to obtain a walk $W_2$
from $u$ to $v$ that is shorter than $W_1$. Since the lengths keep decreasing, this process must
eventually terminate with a path $W_k$ from $u$ to $v$. ($W_k$ has length zero if $u = v$.) The
converse is immediate, since every path in $G$ from $u$ to $v$ is a walk in $G$ from $u$ to $v$. □


FIGURE 3.10
Converting a walk to a path.


3.53. Example. In the simple graph shown in Figure 3.10, consider the walk

$$W = (11, 10, 1, 2, 10, 3, 4, 8, 11, 8, 12, 10, 6, 9, 7, 13, 9, 12, 8, 5).$$

First, the repetition $v_0 = 11 = v_8$ leads to the walk

$$W_1 = (11, 8, 12, 10, 6, 9, 7, 13, 9, 12, 8, 5).$$

Eliminating the multiple visits to vertex 8 leads to the walk

$$W_2 = (11, 8, 5).$$

$W_2$ is a path from 11 to 5.
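
The shortening procedure from the proof of 3.52 can be written out for walks in a simple graph, where a walk is determined by its vertex sequence. The Python sketch below (the function name is ours) records every intermediate walk, so that applying it to the walk $W$ of this example reproduces $W_1$ and $W_2$.

```python
def walk_to_path(walk):
    """Return the list [W, W_1, W_2, ...] of successively shortened walks.

    Each step picks the first vertex v_i that is repeated, finds its last
    occurrence v_j, and deletes the portion of the walk strictly after v_i
    up to and including v_j.
    """
    walk = list(walk)
    steps = [walk]
    while len(set(walk)) < len(walk):
        i = next(idx for idx, v in enumerate(walk) if walk.count(v) > 1)  # minimal i
        j = len(walk) - 1 - walk[::-1].index(walk[i])                     # maximal j
        walk = walk[:i + 1] + walk[j + 1:]
        steps.append(walk)
    return steps

W = [11, 10, 1, 2, 10, 3, 4, 8, 11, 8, 12, 10, 6, 9, 7, 13, 9, 12, 8, 5]
steps = walk_to_path(W)
```

The last entry of `steps` has no repeated vertex and keeps the endpoints of `W`, so it is a path from 11 to 5.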


3.54. Corollary: Connectedness and Paths. A graph or digraph $G = (V, E, \epsilon)$ is
connected iff for all $u, v \in V$, there is at least one path in $G$ from $u$ to $v$.


By looking at pictures of graphs, it becomes visually evident that any graph decomposes 
into a disjoint union of connected pieces, with no edge joining vertices in two separate pieces. 
These pieces are called the (connected) components of the graph. The situation for digraphs 
is more complicated, since there may exist directed edges between different components. To 
give a formal development of these ideas, we introduce the following equivalence relation. 


3.55. Definition: Interconnection Relation. Let $G = (V, E, \epsilon)$ be a graph or digraph.
Define a binary relation $\leftrightarrow_G$ on the vertex set $V$ by setting $u \leftrightarrow_G v$ iff there exist walks in
$G$ from $u$ to $v$ and from $v$ to $u$.


In the case of graphs, note that $u \leftrightarrow_G w$ iff there is a walk in $G$ from $u$ to $w$, since the
reversal of such a walk is a walk in $G$ from $w$ to $u$. Now, for a graph or digraph $G$, let us
verify that $\leftrightarrow_G$ is indeed an equivalence relation on $V$. First, for all $u \in V$, $(u)$ is a walk of
length 0 from $u$ to $u$, so $u \leftrightarrow_G u$ and $\leftrightarrow_G$ is reflexive on $V$. Second, the symmetry of $\leftrightarrow_G$ is
automatic from the way we defined $\leftrightarrow_G$: $u \leftrightarrow_G v$ implies $v \leftrightarrow_G u$ for all $u, v \in V$. Finally,
to check transitivity, suppose $u, v, w \in V$ satisfy $u \leftrightarrow_G v$ and $v \leftrightarrow_G w$. Let $W_1$, $W_2$, $W_3$,
and $W_4$ be walks in $G$ from $u$ to $v$, from $v$ to $w$, from $v$ to $u$, and from $w$ to $v$, respectively.
Then the concatenation of $W_1$ followed by $W_2$ is a walk in $G$ from $u$ to $w$, whereas the
concatenation of $W_4$ followed by $W_3$ is a walk in $G$ from $w$ to $u$. Hence $u \leftrightarrow_G w$, as desired.


3.56. Definition: Components. Let $G = (V, E, \epsilon)$ be a graph or digraph. The components
of $G$ are the equivalence classes of the interconnection equivalence relation $\leftrightarrow_G$. Components
are also called connected components or (in the case of digraphs) strong components.


Since $\leftrightarrow_G$ is an equivalence relation on $V$, the components of $G$ form a set partition
of the vertex set $V$. Given a component $C$ of $G$, consider the graph or digraph $(C, E', \epsilon')$
obtained by retaining those edges in $E$ with both endpoints in $C$ and restricting $\epsilon$ to this
set of edges. One may check that this graph or digraph is connected.


3.57. Example. The components of the graph $G_2$ in Figure 3.1 are $\{0\}$ and $\{1, 2, 3, 4, 5, 6\}$.
The components of the digraph $G_3$ in that figure are $\{1, 2, 4, 5\}$, $\{3\}$, $\{6\}$, and $\{7\}$.
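
The definition of components translates directly into a computation: find the set of vertices reachable from each vertex, and group $u$ with every $v$ such that each is reachable from the other. The Python sketch below favors clarity over efficiency. The sample edge list is a hypothetical example (not one of the graphs of Figure 3.1), and the function names are our own.

```python
def reachable(adj, u):
    """Set of vertices v such that there is a walk from u to v."""
    seen = {u}
    stack = [u]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def components(vertices, edges, directed):
    """Equivalence classes of the interconnection relation, in order of first vertex."""
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        if not directed:
            adj[w].add(u)          # in a graph, every edge can be traversed both ways
    reach = {v: reachable(adj, v) for v in vertices}
    classes, assigned = [], set()
    for u in vertices:
        if u not in assigned:
            cls = {v for v in reach[u] if u in reach[v]}
            classes.append(cls)
            assigned |= cls
    return classes

# Hypothetical example: the same edge list read as a digraph and as a graph.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 1), (2, 3), (3, 4), (4, 3)]
strong = components(vertices, edges, directed=True)
connected = components(vertices, edges, directed=False)
```

Read as a digraph, the edge from 2 to 3 joins two different strong components, which illustrates the remark that directed edges may run between components.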


The next theorems describe how the addition or deletion of an edge affects the compo- 
nents of a graph. 


3.58. Theorem: Edge Deletion and Components. Let $G = (V, E, \epsilon)$ be a graph with
components $\{C_i : i \in I\}$. Let $e \in E$ be an edge with endpoints $v, w \in C_j$. Let $G' = (V, E', \epsilon')$
where $E' = E \setminus \{e\}$ and $\epsilon'$ is the restriction of $\epsilon$ to $E'$.

(a) If $e$ appears in some cycle of $G$, then $G$ and $G'$ have the same components.

(b) If $e$ appears in no cycle of $G$, then $G'$ has one more component than $G$. More precisely,
the components of $G'$ are the $C_k$ with $k \ne j$, together with two disjoint sets $A$ and $B$ such
that $A \cup B = C_j$, $v \in A$, and $w \in B$.


Proof. For (a), let (v_0, e_1, v_1, e_2, ..., v_s) be a cycle of G containing e. Cyclically shifting and
reversing the cycle if needed, we can assume v_0 = v = v_s, e_1 = e, and v_1 = w. Statement (a)
will follow if we can show that the interconnection relations ↔_G and ↔_{G'} coincide. First,
for all y, z ∈ V, y ↔_{G'} z implies y ↔_G z since every walk in the smaller graph G' is also a
walk in G. On the other hand, does y ↔_G z imply y ↔_{G'} z? We know there is a walk W
from y to z in G. If W does not use the edge e, W is a walk from y to z in G'. Otherwise,
we can modify W as follows. Every time W goes from v = v_s = v_0 to w = v_1 via e, replace
this part of the walk by the sequence (v_s, e_s, ..., e_2, v_1) obtained by taking a detour around
the cycle. Make a similar modification each time W goes from w to v via e. This produces
a walk in G' from y to z.

For (b), let us compute the equivalence classes of ↔_{G'}. First, fix z ∈ C_k, where k ≠ j.
The set C_k consists of all vertices in V reachable from z by walks in G. One readily checks
that none of these walks can use the edge e, so C_k is also the set of all vertices in V reachable
from z by walks in G'. So C_k is the equivalence class of z relative to both ↔_G and ↔_{G'}.

Next, let A and B be the equivalence classes of v and w (respectively) relative to ↔_{G'}.
By definition, A and B are two of the components of G' (possibly the same component).
We now show that A and B are disjoint and that their union is C_j. If the equivalence
classes A and B are not disjoint, then they must be equal. By 3.52, there must be a path
(v_0, e_1, v_1, ..., e_s, v_s) in G' from v to w. Appending e, v_0 to this path would produce a cycle
in G involving the edge e, which is a contradiction. Thus A and B are disjoint. Let us show
that A ∪ B ⊆ C_j. If z ∈ A, then there is a walk in G' (and hence in G) from v to z. Since
C_j is the equivalence class of v relative to ↔_G, it follows that z ∈ C_j. Similarly, z ∈ B
implies z ∈ C_j since C_j is also the equivalence class of w relative to ↔_G. Next, we check
that C_j ⊆ A ∪ B. Let z ∈ C_j, and let W = (w_0, e_1, w_1, ..., w_t) be a walk in G from v to z. If
W does not use the edge e, then z ∈ A. If W does use e, then the portion of W following the
last appearance of the edge e is a walk from either v or w to z in G'; thus z ∈ A ∪ B. Since
the union of A, B, and the C_k with k ≠ j is all of V, we have found all the components of
G'. □


The previous result suggests the following terminology. 


3.59. Definition: Cut-Edges. An edge e in a graph G is a cut-edge iff e does not appear 
in any cycle of G. 

3.60. Theorem: Edge Addition and Components. Let G = (V, E, ε) be a graph with
components {C_i : i ∈ I}. Let G^+ = (V, E^+, ε^+) be the graph obtained from G by adding a
new edge e with endpoints v ∈ C_j and w ∈ C_k.

(a) If v and w are in the same component C_j of G, then e is involved in a cycle of G^+, and
G and G^+ have the same components.

(b) If v and w are in different components of G, then e is a cut-edge of G^+, and the
components of G^+ are C_j ∪ C_k and the C_i with i ≠ j, k.


This theorem follows readily from 3.58, so we leave the proof to the reader. 




3.9 Forests 


3.61. Definition: Forests. A forest is a graph with no cycles. Such a graph is also called 
acyclic. 


A forest cannot have any loops or multiple edges between the same two vertices. So we 
can assume, with no loss of generality, that forests are simple graphs. 


3.62. Example. Figure 3.11 displays a forest. 


Recall from 3.54 that a graph G is connected iff there exists at least one path between 
any two vertices of G. The next result gives an analogous characterization of forests. 


3.63. Theorem: Forests and Paths. A graph G is acyclic iff G has no loops and for all
u, v in V(G), there is at most one path from u to v in G.


FIGURE 3.11
A forest.


Proof. We prove the contrapositive in both directions. First suppose that G has a cycle
C = (v_0, e_1, v_1, ..., e_s, v_s). If s = 1, G has a loop. If s > 1, then (v_1, e_2, ..., e_s, v_s) and
(v_1, e_1, v_0) are two distinct paths in G from v_1 to v_0. For the converse, we can assume G is
simple. Suppose u and v are vertices of G and P = (x_0, x_1, ..., x_s), Q = (y_0, y_1, ..., y_t) are
two distinct paths in G from u to v, where all x_i's and y_j's are in V(G). We will use these
paths to construct a cycle in G. Note that the concatenation of P and the reversal of Q is a
walk in G that starts and ends at u. But this walk need not be a cycle, since it may involve
repeated edges or vertices.


FIGURE 3.12
Extracting a cycle from two paths.

Since P and Q are distinct, there is an index j such that x_i = y_i for 0 ≤ i ≤ j, but
x_{j+1} ≠ y_{j+1}. Since P and Q both end at v, there must exist a least index k ≥ j + 1 such
that x_k is a vertex in Q, say x_k = y_r. It follows from the choice of j and k that either
k = j + 1 and r > j + 1, or k > j + 1 and r ≥ j + 1. In any case,

    C = (x_j, x_{j+1}, ..., x_k = y_r, y_{r−1}, ..., y_{j+1}, y_j)

is a cycle in G. Figure 3.12 illustrates this argument. □

The following result gives a formula for the number of components in a forest.


3.64. Theorem: Components of a Forest. Let G be a forest with n vertices and k
edges. The number of connected components of G is n − k.


Proof. We use induction on k. The result holds for k = 0, since G consists of n isolated
vertices in this case. Assume that k > 0 and the result is already known for forests with n
vertices and k − 1 edges. Given a forest G with n vertices and k edges, remove one edge e
from G to get a new graph H. The graph H is acyclic and has n vertices and k − 1 edges.
By induction, H has n − (k − 1) = n − k + 1 components. On the other hand, e must be a
cut-edge since G has no cycles. It follows from 3.58 that H has one more component than
G. Thus, G has n − k components, as desired. □
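
The count n − k can be checked numerically by adding the edges of a forest one at a time:
each new edge joins two different components (compare 3.60(b)), so the component count
drops by one per edge. The sketch below assumes vertices labeled 0, ..., n − 1; the function
name `count_components` and the sample forest are hypothetical.

```python
def count_components(n, edges):
    """Return (number of components, whether the graph is acyclic) for vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    remaining = n
    acyclic = True
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            acyclic = False      # this edge closes a cycle (or is a loop)
        else:
            parent[root_u] = root_v
            remaining -= 1       # two components merge into one
    return remaining, acyclic


# Hypothetical forest with n = 8 vertices and k = 5 edges.
forest_edges = [(0, 1), (1, 2), (1, 3), (4, 5), (6, 7)]
print(count_components(8, forest_edges))  # (3, True), and 3 = 8 - 5
```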




3.10 Trees 


3.65. Definition: Trees. A tree is a connected graph with no cycles. 
3.66. Example. Figure 3.13 displays a tree. 


Every component of a forest is a tree, so every forest is a disjoint union of trees. The
next result is an immediate consequence of 3.37.


FIGURE 3.13
A tree.


FIGURE 3.14
Pruning a leaf from a tree gives another tree.


3.67. Theorem: Trees Have Leaves. If T is a tree with more than one vertex, then T 
has at least two leaves. 


3.68. Definition: Pruning. Suppose G = (V, E) is a simple graph, v_0 is a leaf in G, and
e_0 is the unique edge incident to the vertex v_0. The graph obtained by pruning v_0 from G is
the graph (V ∖ {v_0}, E ∖ {e_0}).


3.69. Pruning Lemma. If T is an n-vertex tree, v_0 is a leaf of T, and T' is obtained from
T by pruning v_0, then T' is a tree with n − 1 vertices.


Proof. First, T has no cycles, and the deletion of v_0 and the associated edge e_0 will not
create any cycles. So T' is acyclic. Second, let u, w ∈ V(T'). There is a path from u to w
in T. Since u ≠ v_0 ≠ w and the leaf v_0 has only one incident edge, v_0 cannot be an interior
vertex of this path, so the path will not use the edge e_0 or the vertex v_0. Thus there is a
path from u to w in T', so T' is connected. □


Figure 3.14 illustrates the pruning lemma. 
To illustrate an application of pruning, we now prove a fundamental relationship between 
the number of vertices and edges in a tree (this also follows from 3.64). 


3.70. Theorem: Number of Edges in a Tree. If G is a tree with n > 0 vertices, then 
G has n — 1 edges. 


Proof. We argue by induction on n. If n = 1, then G must have 0 = n − 1 edges. Assume
n > 1 and that the result holds for trees with n − 1 vertices. Let T be a tree with n vertices.
We know that T has at least one leaf (by 3.67); let v_0 be one such leaf. Let T' be the graph
obtained by pruning v_0 from T. By the pruning lemma, T' is a tree with n − 1 vertices. By
induction, T' has n − 2 edges. Hence, T has n − 1 edges. □


3.71. Theorem: Characterizations of Trees. Let G be a graph with n vertices. The 
following conditions are logically equivalent: 

(a) G is a tree (i.e., G is connected and acyclic). 

(b) G is connected and has ≤ n − 1 edges.

(c) G is acyclic and has ≥ n − 1 edges.

(d) G has no loop edges, and for all u, v ∈ V(G), there is a unique path in G from u to v.

Moreover, when these conditions hold, G has n − 1 edges.


Proof. First, (a) implies (b) and (a) implies (c) by 3.70. Second, (a) is equivalent to (d) by
virtue of 3.54 and 3.63. Third, let us prove (b) implies (a). Assume G is connected with
k ≤ n − 1 edges. If G has a cycle, delete one edge on some cycle of G. The resulting graph
is still connected (by 3.58) and has k − 1 edges. Continue to delete edges in this way, one
at a time, until there are no cycles. If we deleted i edges total, the resulting graph is a tree
with k − i ≤ n − 1 − i edges and n vertices. By 3.70, we must have i = 0 and k = n − 1. So
no edges were deleted, and G itself is in fact a tree.

Fourth, let us prove (c) implies (a). Assume G is acyclic with k ≥ n − 1 edges. If G is
not connected, add an edge joining two distinct components of G. The resulting graph is
still acyclic (by 3.60) and has k + 1 edges. Continue to add edges in this way, one at a time,
until the graph becomes connected. If we added i edges total, the resulting graph is a tree
with k + i ≥ n − 1 + i edges and n vertices. By 3.70, we must have i = 0 and k = n − 1. So
no edges were added, and G itself is in fact a tree. □




3.11 Counting Trees 


The next theorem, usually attributed to Cayley, counts n-vertex trees. 


3.72. Theorem: Enumeration of Trees. For all n ≥ 1, there are n^{n−2} trees with vertex
set {1, 2, ..., n}.


3.73. Example. Figure 3.15 displays all 4^{4−2} = 16 trees with vertex set {1, 2, 3, 4}.

Theorem 3.72 is an immediate consequence of 3.47 and the following bijection.
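
For small n, the count in 3.72 can be confirmed by exhaustive search. The sketch below is
a brute-force check, not an efficient method: it examines every set of n − 1 edges on the
vertex set {1, ..., n} and uses the fact (see 3.71(c)) that an acyclic graph with n vertices
and n − 1 edges is a tree. The function name `count_labeled_trees` is a hypothetical choice.

```python
from itertools import combinations


def count_labeled_trees(n):
    """Count trees with vertex set {1, ..., n} by testing every (n-1)-edge subset."""
    vertices = range(1, n + 1)
    all_edges = list(combinations(vertices, 2))
    count = 0
    for edge_set in combinations(all_edges, n - 1):
        parent = {v: v for v in vertices}

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for u, v in edge_set:
            root_u, root_v = find(u), find(v)
            if root_u == root_v:   # the edge would close a cycle
                acyclic = False
                break
            parent[root_u] = root_v
        if acyclic:
            count += 1
    return count


for n in range(2, 7):
    print(n, count_labeled_trees(n), n ** (n - 2))
```

For n = 4 this examines 20 edge sets and rejects the 4 that contain a triangle, leaving the
16 trees of Example 3.73.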


3.74. Theorem: Trees vs. Rooted Trees. Let V be a finite set and v_0 ∈ V. There is a
bijection from the set A of trees with vertex set V to the set B of rooted trees with vertex
set V and root v_0.


Proof. We define maps f : A → B and g : B → A that are two-sided inverses. First, given
T = (V, E) ∈ A, construct f(T) = (V, E') as follows. For each v ∈ V with v ≠ v_0, there
exists a unique path from v to v_0 in T. Letting e = {v, w} be the first edge on this path,
we add the directed edge (v, w) to E'. Also, we add the loop edge (v_0, v_0) to E'. Since T
has no cycles, the only possible cycle in the resulting functional digraph f(T) is the 1-cycle
(v_0). It follows that f(T) is a rooted tree on V with root v_0 (see 3.42).

Next, given a rooted tree S ∈ B, define g(S) by deleting the unique loop edge (v_0, v_0)
and replacing every directed edge (v, w) by an undirected edge {v, w}. The resulting graph
g(S) has n vertices and n − 1 edges, where n = |V|. To see that g(S) is connected, fix
y, z ∈ V. Following outgoing edges from y (resp. z) in S produces a directed path from y
(resp. z) to v_0 in S. In the undirected graph g(S), we can concatenate the path from y to
v_0 with the reverse of the path from z to v_0 to get a walk from y to z. Since g(S) is
connected with n − 1 edges, it follows from 3.71 that g(S) is a tree.


FIGURE 3.15
The 16 trees on four vertices.


It is routine to check that g ∘ f = id_A, since f assigns a certain orientation to each
edge of the original tree, and this orientation is then forgotten by g. It is somewhat less
routine to verify that f ∘ g = id_B; we leave this to the reader. (One must check that the
edge orientations in f(g(S)) agree with the edge orientations in S, for each S ∈ B.) □


A different bijective proof of Cayley’s theorem, which employs parking functions, is 
presented in §12.5. We next prove a refinement of Cayley’s theorem that counts the number 
of trees such that each vertex has a specified degree. We give an algebraic proof first, and 
then convert this to a bijective proof in the next section. 


3.75. Theorem: Counting Trees with Specified Degrees. Suppose n ≥ 2 and
d_1, ..., d_n ≥ 0 are fixed integers. If d_1 + ··· + d_n = 2n − 2, then there are

\[ \binom{n-2}{d_1 - 1,\, d_2 - 1,\, \ldots,\, d_n - 1} = \frac{(n-2)!}{\prod_{j=1}^{n} (d_j - 1)!} \]

trees with vertex set {v_1, v_2, ..., v_n} such that deg(v_j) = d_j for all j. If d_1 + ··· + d_n ≠ 2n − 2,
then there are no such trees.
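
The formula in 3.75 can be compared against an exhaustive count for small n. The sketch
below is a brute-force check under stated assumptions: the vertices are taken to be 1, ..., n,
the helper names are hypothetical, and the multinomial coefficient is treated as zero whenever
some d_j is less than 1.

```python
from collections import Counter
from itertools import combinations
from math import factorial


def is_acyclic(vertices, edge_set):
    """Return True iff the given edges contain no cycle (union-find test)."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edge_set:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False
        parent[root_u] = root_v
    return True


def degree_sequence_counts(n):
    """Map each degree sequence (d_1, ..., d_n) to the number of trees on {1..n} having it."""
    vertices = list(range(1, n + 1))
    all_edges = list(combinations(vertices, 2))
    counts = Counter()
    for edge_set in combinations(all_edges, n - 1):
        # Acyclic with n - 1 edges, hence a tree by 3.71.
        if is_acyclic(vertices, edge_set):
            degree = Counter(v for edge in edge_set for v in edge)
            counts[tuple(degree[v] for v in vertices)] += 1
    return counts


def multinomial_tree_count(degrees):
    """Evaluate (n-2)! / prod (d_j - 1)!, or 0 when the hypotheses of 3.75 fail."""
    n = len(degrees)
    if sum(degrees) != 2 * n - 2 or min(degrees) < 1:
        return 0
    denominator = 1
    for d in degrees:
        denominator *= factorial(d - 1)
    return factorial(n - 2) // denominator


counts = degree_sequence_counts(5)
print(all(multinomial_tree_count(seq) == c for seq, c in counts.items()))
```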


Proof. The last statement holds because any tree T on n vertices has n − 1 edges, and thus
Σ_{i=1}^{n} deg(v_i) = 2(n − 1). Assume henceforth that d_1 + ··· + d_n = 2n − 2. We prove the
result by induction on n. First consider the case n = 2. If d_1 = d_2 = 1, there is exactly one
valid tree, and \( \binom{n-2}{d_1-1,\, d_2-1} = \binom{0}{0,\, 0} = 1 \). For any other choice of d_1, d_2
adding to 2, there are no valid trees, and \( \binom{n-2}{d_1-1,\, d_2-1} = 0 \) (by convention, since
one of the lower entries is negative).

Now assume n > 2 and that the theorem is known to hold for trees with n − 1 vertices.
Let A be the set of trees T with V(T) = {v_1, ..., v_n} and deg(v_j) = d_j for all j. If d_j = 0 for
some j, then A is empty and the formula in the theorem is zero by convention. Now suppose
d_j > 0 for all j. We must have d_i = 1 for some i, for otherwise d_1 + ··· + d_n ≥ 2n > 2n − 2.
Fix an i with d_i = 1. Note that v_i is a leaf in T for every T ∈ A. Now define

    A_k = {T ∈ A : {v_i, v_k} ∈ E(T)}    (1 ≤ k ≤ n, k ≠ i).

A_k is the set of trees in A in which the leaf v_i is attached to the vertex v_k. A is the disjoint
union of the sets A_k.

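Fix k ≠ i. Pruning the leaf v_i gives a bijection between A_k and the set B_k of all
trees with vertex set {v_1, ..., v_{i−1}, v_{i+1}, ..., v_n} such that deg(v_j) = d_j for j ≠ i, k and
deg(v_k) = d_k − 1. By induction hypothesis,

\[ |B_k| = \frac{(n-3)!}{(d_k - 2)! \prod_{1 \le j \le n,\; j \ne i,k} (d_j - 1)!}. \]

Therefore,

\[ |A| = \sum_{k \ne i} |A_k| = \sum_{k \ne i} |B_k|
       = \sum_{k \ne i} \frac{(n-3)!}{(d_k - 2)! \prod_{j \ne i,k} (d_j - 1)!}
       = \sum_{k \ne i} \frac{(n-3)!\,(d_k - 1)}{\prod_{j \ne i} (d_j - 1)!}. \]

(A term with d_k = 1 is zero on both sides, consistent with B_k being empty in that case.)

Now, since d_i = 1, Σ_{k≠i} (d_k − 1) = Σ_{k=1}^{n} (d_k − 1) = (2n − 2) − n = n − 2. Inserting this
into the previous formula, and using (d_i − 1)! = 1, we see that

\[ |A| = \frac{(n-2)!}{\prod_{j=1}^{n} (d_j - 1)!}, \]

which completes the induction proof. □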


3.76. Corollary: Second Proof of 3.72. Let us sum the previous formula over all possible
degree sequences (d_1, ..., d_n). Making the change of variables c_i = d_i − 1 and invoking the
multinomial theorem 2.12, we see that the total number of trees on this vertex set is

\[ \sum_{\substack{d_1 + \cdots + d_n = 2n-2 \\ d_i > 0}} \binom{n-2}{d_1 - 1, \ldots, d_n - 1}
   = \sum_{\substack{c_1 + \cdots + c_n = n-2 \\ c_i \ge 0}} \binom{n-2}{c_1, \ldots, c_n} 1^{c_1} 1^{c_2} \cdots 1^{c_n}
   = (1 + 1 + \cdots + 1)^{n-2} = n^{n-2}. \]


3.12 Pruning Maps 


We now develop a bijective proof of 3.75. Suppose n ≥ 2 and d_1, ..., d_n are positive integers
that sum to 2n − 2. Let V = {v_1, ..., v_n} be a vertex set consisting of n positive integers.
Let A be the set of trees T with vertex set V such that deg(v_j) = d_j for all j. Let B be
the set of words R(v_1^{d_1−1} v_2^{d_2−1} ··· v_n^{d_n−1}) as in 1.44. Each word w ∈ B has length n − 2
and consists of d_j − 1 copies of v_j for all j. To prove 3.75, it suffices to define a bijection
f : A → B.

Given a tree T ∈ A, we compute f(T) by repeatedly pruning off the largest leaf of T,
recording for each leaf the vertex adjacent to it in T. More formally, for i ranging from
1 to n − 1, let x be the largest leaf of T; define w_i to be the unique neighbor of x in T;
then modify T by pruning the leaf x. This process produces a word w_1 ··· w_{n−1}; we define
f(T) = w_1 ··· w_{n−2}.


FIGURE 3.16
A tree with (d_1, ..., d_9) = (1, 2, 1, 1, 1, 2, 3, 1, 4).


3.77. Example. Let T be the tree shown in Figure 3.16. To compute f(T), we prune leaves
from T in the following order: 8, 5, 4, 9, 6, 3, 2, 7. Recording the neighbors of these leaves, we
see that w = f(T) = 9996727. Observe that the algorithm computes w_{n−1} = 1, but this
letter is not part of the output word w. Also observe that w ∈ R(1^0 2^1 3^0 4^0 5^0 6^1 7^2 8^0 9^3) =
R(1^{d_1−1} ··· 9^{d_9−1}).
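
The pruning map can be sketched in a few lines. In the code below, the function name
`pruning_word` is a hypothetical choice, and the edge list is an assumption reconstructed
from the pruning order and recorded neighbors stated in Example 3.77 (each pruned leaf is
joined to its recorded neighbor), since the figure itself is not reproduced here.

```python
def pruning_word(edges):
    """Compute f(T): repeatedly prune the largest leaf and record its neighbor."""
    adjacent = {}
    for u, v in edges:
        adjacent.setdefault(u, set()).add(v)
        adjacent.setdefault(v, set()).add(u)
    n = len(adjacent)
    word = []
    for _ in range(n - 1):
        leaf = max(v for v in adjacent if len(adjacent[v]) == 1)
        (neighbor,) = adjacent.pop(leaf)   # a leaf has exactly one neighbor
        adjacent[neighbor].discard(leaf)
        word.append(neighbor)
    return word[:-1]                       # drop the final letter w_{n-1}


# Edges inferred from Example 3.77: leaves 8, 5, 4, 9, 6, 3, 2, 7 with
# recorded neighbors 9, 9, 9, 6, 7, 2, 7, 1 (an assumption about Figure 3.16).
example_edges = [(8, 9), (5, 9), (4, 9), (9, 6), (6, 7), (3, 2), (2, 7), (7, 1)]
print(pruning_word(example_edges))  # [9, 9, 9, 6, 7, 2, 7]
```

Selecting the largest leaf with `max` on every step keeps the sketch close to the definition;
a priority queue would be the usual choice for large trees.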


The observations in the last example hold in general. For, given any T ∈ A, repeatedly
pruning leaves from T will produce a sequence of smaller trees, by the pruning lemma 3.69.
By 3.67, each such tree (except the last tree) has at least two leaves, so vertex v_1 will never
be chosen for pruning (here we take v_1 to be the smallest element of V). In particular, v_1 is
always the last vertex left, so that w_{n−1} is always v_1. Furthermore, if v_j is any vertex
different from v_1, then the number of occurrences of v_j in w_1 w_2 ··· w_{n−1} is exactly d_j − 1.
For, every time a pruning operation removes an edge touching v_j, we set w_i = v_j for some i,
except when we are removing the last remaining edge touching v_j (which occurs when v_j
has become the largest leaf and is being pruned). The same reasoning shows that v_1 (which
never gets pruned) appears d_1 times in w_1 ··· w_{n−1}. Since w_{n−1} = v_1, every vertex v_j
occurs d_j − 1 times in the output word w_1 ··· w_{n−2}.

To see that f is a bijection, we argue by induction on the number of vertices. The
result holds when n = 2, since in this case, A consists of a single tree with two nodes,
and B consists of a single word (the empty word). Now suppose n > 2 and the maps f
(defined for trees with fewer than n vertices) are already known to be bijections. Given
w = w_1 ··· w_{n−2} ∈ B, we will show there exists exactly one T ∈ A with f(T) = w. If
such T exists, the leaves of T are precisely the vertices in V(T) that do not appear in w.
Thus, the first leaf that gets pruned when computing f(T) must be the largest element z
of V(T) ∖ {w_1, ..., w_{n−2}}. By induction hypothesis, there exists exactly one tree T' on the
vertex set V(T) ∖ {z} (with the appropriate vertex degrees) such that f(T') = w_2 ··· w_{n−2}.
This given, we will have f(T) = w iff T is the tree obtained from T' by attaching a new leaf
z as a neighbor of vertex w_1. One readily confirms that this graph is in A (i.e., the graph is
a tree with the correct vertex degrees). This completes the induction argument. The proof
also yields a recursive algorithm for computing f^{−1}(w). The key point is to use the letters
not seen in w (and its suffixes) to determine the identity of the leaf that was pruned at each
stage.


3.78. Example. Given w = 6799297 and V = {1, 2, ..., 9}, let us compute the tree f^{−1}(w)
with vertex set V. The leaves of this tree must be {1, 3, 4, 5, 8}, which are the elements of
V not seen in w. Leaf 8 was pruned first and was adjacent to vertex 6. So now we must
compute the tree f^{−1}(799297) with vertex set V ∖ {8}. Here, leaf 6 was pruned first and
was adjacent to vertex 7. Continuing in this way, we deduce that the leaves were pruned
in the order 8, 6, 5, 4, 3, 2, 9, 7; and the neighbors of these leaves (reading from w) were
6, 7, 9, 9, 2, 9, 7, 1. Thus, f^{−1}(w) is the tree shown in Figure 3.17.


FIGURE 3.17
The tree associated to w = 6799297 by f^{−1}.
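
The inverse map admits an iterative sketch that follows the reasoning of Example 3.78: at
each stage, the pruned leaf is the largest remaining vertex that does not occur in the unread
part of w. The function name `tree_from_word` is a hypothetical choice, and the code assumes
that w is a valid word on the given vertex set.

```python
def tree_from_word(word, vertices):
    """Return the edges of the tree whose pruning word is `word`, in pruning order."""
    word = list(word)
    remaining = set(vertices)
    edges = []
    for i, neighbor in enumerate(word):
        unread = set(word[i:])
        # Leaves of the current tree are the remaining vertices not seen in the unread suffix.
        leaf = max(v for v in remaining if v not in unread)
        edges.append((leaf, neighbor))
        remaining.remove(leaf)
    # Two vertices are left; the larger is pruned last and is attached to the smaller.
    smaller, larger = sorted(remaining)
    edges.append((larger, smaller))
    return edges


print(tree_from_word([6, 7, 9, 9, 2, 9, 7], range(1, 10)))
# [(8, 6), (6, 7), (5, 9), (4, 9), (3, 2), (2, 9), (9, 7), (7, 1)]
```

Each pair lists a pruned leaf followed by its neighbor, which matches the pruning order
8, 6, 5, 4, 3, 2, 9, 7 and the neighbors 6, 7, 9, 9, 2, 9, 7, 1 found in the example.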


3.13 Ordered Trees and Terms


In this section, we study combinatorial structures called ordered trees, which are defined 
recursively as follows. 


3.79. Definition: Ordered Trees. The symbol 0 is an ordered tree. If n > 0 and
T_1, ..., T_n is a sequence of ordered trees, then the (n + 1)-tuple


(n, T_1, T_2, ..., T_n)


is an ordered tree. All ordered trees arise by applying these two rules a finite number of 
times. 


We often think of ordered trees pictorially. The ordered tree 0 is depicted as a single
node. The ordered tree (n, T_1, T_2, ..., T_n) is drawn by putting a single “root node” at the
top of the picture with n edges leading down. At the ends of these edges, reading from
left to right, we recursively draw pictures of the trees T_1, T_2, ..., T_n in this order. The term
“ordered tree” emphasizes the fact that the left-to-right order of the children of each node is
significant. Note that an ordered tree is not a tree in the graph-theoretic sense, and ordered
trees are not the same as rooted trees.


3.80. Example. Figure 3.18 illustrates the ordered tree 


T = (4, (2,0, (1, 0)), 0, (3, 0, (3, 0, 0, 0), 0), 0). 


FIGURE 3.18
Picture of an ordered tree.


Ordered trees can be used to model algebraic expressions that are built up by applying
functions to lists of arguments. For example, the tree T in the previous example represents
the syntactic structure of the following algebraic expression:


f(g(x_1, h(x_2)), x_3, k(x_4, j(x_5, x_6, x_7), x_8), x_9).


More specifically, if we replace each function symbol f,g,h,k,j by its arity (number of 
arguments) and replace each variable x; by zero, we obtain 


4(2(0, 1(0)), 0, 3(0, 3(0, 0, 0), 0), 0). 


This becomes T if we move each left parenthesis to the left of the positive integer
immediately preceding it and put a comma in its original location.

Surprisingly, it turns out that the syntactic structure of such an algebraic expression 
is uniquely determined even if we erase all the parentheses. To prove this statement, we 
introduce a combinatorial object called a term that is like an ordered tree, but contains 
no parentheses. For example, the algebraic expression above will be modeled by the term 
42010030300000. 


3.81. Definition: Words and Terms. A word is a finite sequence of natural numbers.
We define terms recursively as follows. The word 0 is a term. If n > 0 and T_1, T_2, ..., T_n
are terms, then the word nT_1T_2···T_n is a term. All terms arise by applying these two rules
a finite number of times.


We see from this definition that every term is a nonempty word. 


3.82. Definition: Weight of a Word. Given a word w = w_1w_2···w_s, the weight of w is
wt(w) = w_1 + w_2 + ··· + w_s − s.


For example, wt(42010030300000) = 13 − 14 = −1. Note that wt(vw) = wt(v) + wt(w)
for all words v, w. The next result uses weights to characterize terms.


3.83. Theorem: Characterization of Terms. A word w = w_1w_2···w_s is a term iff
wt(w) = −1 and wt(w_1w_2···w_k) ≥ 0 for all k < s.
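
The criterion in 3.83 gives a one-pass test for terms: keep a running weight and reject as
soon as a proper prefix has negative weight. The following is a minimal sketch in which words
are represented as lists of natural numbers; the function names `weight` and `is_term` are
hypothetical.

```python
def weight(word):
    """wt(w) = (sum of the letters) - (length of w)."""
    return sum(word) - len(word)


def is_term(word):
    """Apply 3.83: total weight -1 and every proper prefix of weight >= 0."""
    running = 0
    for k, letter in enumerate(word, start=1):
        running += letter - 1
        if k < len(word) and running < 0:
            return False           # a proper prefix has negative weight
    return len(word) > 0 and running == -1


example = [int(c) for c in "42010030300000"]
print(weight(example), is_term(example))  # -1 True
```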


Proof. We argue by strong induction on the length s of the word. First suppose w =
w_1w_2···w_s is a term of length s. If w = 0, then the weight condition holds. Otherwise,
we must have w = nT_1T_2···T_n where n > 0 and T_1, T_2, ..., T_n are terms. Since each T_i
has length less than s, the induction hypothesis shows that wt(T_i) = −1 and every proper
prefix of T_i has nonnegative weight. So, first of all, wt(w) = wt(n) + wt(T_1) + ··· + wt(T_n) =
(n − 1) − n = −1. On the other hand, consider a proper prefix w_1w_2···w_k of w. If k = 1,
the weight of this prefix is n − 1, which is nonnegative since n > 0. If k > 1, we must
have w_1w_2···w_k = nT_1···T_i z where 0 ≤ i < n and z is a proper prefix of T_{i+1}. Using the
induction hypothesis, the weight of w_1w_2···w_k is therefore (n − 1) − i + wt(z) ≥ (n − i) − 1 ≥ 0.

For the converse, we also use strong induction on the length of the word. Let w =
w_1w_2···w_s satisfy the weight conditions. The empty word has weight zero, so s > 0. If
s = 1, then wt(w_1) = −1 forces w = 0, so that w is a term in this case. Now suppose
s > 1. The first symbol w_1 must be an integer n > 0, lest the proper prefix w_1 of w have
negative weight. Observe that appending one more letter to any word decreases the weight
by at most 1. Since wt(w_1) = n − 1 and wt(w_1w_2···w_s) = −1, there exists a least integer
k_1 with wt(w_1w_2···w_{k_1}) = n − 2. Now if n ≥ 2, there exists a least integer k_2 > k_1 with
wt(w_1w_2···w_{k_2}) = n − 3. We continue similarly, obtaining integers k_1 < k_2 < ··· < k_n
such that k_i is the least index following k_{i−1} such that wt(w_1w_2···w_{k_i}) = n − 1 − i.
Because w satisfies the weight conditions, we must have k_n = s. Now define n subwords
T_1 = w_2w_3···w_{k_1}, T_2 = w_{k_1+1}w_{k_1+2}···w_{k_2}, ..., T_n = w_{k_{n−1}+1}···w_s. Evidently w =
nT_1T_2···T_n. For all i ≤ n, nT_1T_2···T_{i−1} has weight n − i, nT_1T_2···T_i has weight n − i − 1,
and (by minimality of k_i) no proper prefix of nT_1T_2···T_i has weight less than n − i. It
follows that T_i has weight −1 but every proper prefix of T_i has nonnegative weight. Thus
each T_i satisfies the weight conditions and has length less than s. By induction, every T_i is
a term. Then w = nT_1T_2···T_n is also a term, completing the induction. □


3.84. Corollary. No proper prefix of a term is a term. 


3.85. Theorem: Unique Readability of Terms. For every term w, there exists a unique
integer n ≥ 0 and unique terms T_1, ..., T_n such that w = nT_1···T_n.


Proof. Existence follows from the recursive definition of terms. We prove uniqueness by
induction on the length of w. Suppose w = nT_1···T_n = mT'_1···T'_m where n, m ≥ 0 and
every T_i and T'_j is a term. We must prove n = m and T_i = T'_i for all i ≤ n. First, n = w_1 = m.
If T_1 ≠ T'_1, then one of T_1 and T'_1 must be a proper prefix of the other, in violation of the
preceding corollary. So T_1 = T'_1. Then if T_2 ≠ T'_2, one of T_2 and T'_2 must be a proper prefix
of the other, in violation of the corollary. Continuing similarly, we see that T_i = T'_i for all
i. □


Using the previous theorem and induction, one readily proves that erasing all parentheses 
defines a bijection from ordered trees to terms. Therefore, to enumerate various collections of 
ordered trees, it suffices to enumerate the corresponding collections of terms. This technique 
will be used in the next section. 
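
The bijection between ordered trees and terms can be made concrete. In the sketch below,
an ordered tree is represented as in 3.79 (the integer 0 or a nested tuple), a term is a list
of natural numbers, and the function names `erase_parentheses` and `parse_term` are
hypothetical. The parser reads a term from left to right, which is possible precisely because
of unique readability.

```python
def erase_parentheses(tree):
    """Convert an ordered tree (0 or a tuple (n, T_1, ..., T_n)) into its term."""
    if tree == 0:
        return [0]
    n, *children = tree
    word = [n]
    for child in children:
        word.extend(erase_parentheses(child))
    return word


def parse_term(word):
    """Rebuild the ordered tree from a term; raise ValueError if `word` is not a term."""

    def parse(position):
        if position >= len(word):
            raise ValueError("word ended before the term was complete")
        n = word[position]
        position += 1
        if n == 0:
            return 0, position
        children = []
        for _ in range(n):
            child, position = parse(position)
            children.append(child)
        return (n, *children), position

    tree, end = parse(0)
    if end != len(word):
        raise ValueError("extra letters after a complete term")
    return tree


T = (4, (2, 0, (1, 0)), 0, (3, 0, (3, 0, 0, 0), 0), 0)   # the tree of Example 3.80
term = erase_parentheses(T)
print("".join(map(str, term)))   # 42010030300000
print(parse_term(term) == T)     # True
```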




3.14 Ordered Forests and Lists of Terms 


We continue our study of ordered trees and terms by introducing two more general objects: 
ordered forests and lists of terms. 


3.86. Definition: Ordered Forests. For n > 0, an ordered forest of n trees is a list 
(T1, T2,...,Tn), where each T; is an ordered tree. 


3.87. Definition: Lists of Terms. For n > 0, a list of n terms is a word w of the form
w = T_1T_2···T_n, where each T_i is a term.


3.88. Theorem: Weight Characterization of Lists of Terms. A word w = w_1w_2···w_s
is a list of n terms iff wt(w) = −n and wt(w_1w_2···w_k) > −n for all k < s.


Proof. First suppose w is a list of n terms, say w = T_1T_2···T_n. Then nw = nT_1T_2···T_n
is a single term. This term has weight −1, by 3.83, so w has weight −1 − wt(n) = −n. If
wt(w_1···w_k) ≤ −n for some k < s, then the proper prefix nw_1···w_k of the term nw would
have negative weight, contradicting 3.83.

Conversely, suppose w satisfies the weight conditions in 3.88. Then the word nw satisfies
the weight conditions in 3.83, as one immediately verifies. So nw is a term, which must have
the form nT_1T_2···T_n for suitable terms T_1, ..., T_n. Then w = T_1T_2···T_n is a list of n
terms. □


3.89. Theorem: Unique Readability of Lists of Terms. If w = T_1T_2···T_n is a list of
n terms, then n and the terms T_i are uniquely determined by w.


Proof. First, n = −wt(w) is uniquely determined by w. To see that the T_i are unique, add
an n to the beginning of w and then appeal to 3.85. □


We deduce that erasing parentheses gives a bijection between ordered forests of n trees 
and lists of n terms. 
The next lemma reveals a key property that will allow us to enumerate lists of terms. 


3.90. Cycle Lemma for Lists of Terms. Suppose w = w_1w_2···w_s is a word of weight
−n < 0. There exist exactly n indices i ≤ s such that the cyclic rotation

    R_i(w) = w_i w_{i+1} ··· w_s w_1 w_2 ··· w_{i−1}

is a list of n terms.


Proof. Step 1: We prove the result when w itself is a list of n terms. Say w = T_1T_2···T_n,
where T_j is a term of length k_j. Then R_i(w) is a list of n terms for the n indices i ∈
{1, k_1 + 1, k_1 + k_2 + 1, ..., k_1 + k_2 + ··· + k_{n−1} + 1}. Suppose i is another index (different
from those just listed) such that R_i(w) is a list of n terms. For some j ≤ n, we must have

    R_i(w) = y T_{j+1} ··· T_n T_1 ··· T_{j−1} z

where T_j = zy and z, y are nonempty words. Since wt(z) ≥ 0 but wt(T_j) = −1, we must
have wt(y) < 0. So

    wt(y T_{j+1} ··· T_n T_1 ··· T_{j−1}) = wt(y) + wt(T_{j+1}) + ··· + wt(T_{j−1}) < −(n − 1).

Then y T_{j+1} ··· T_{j−1} is a proper prefix of R_i(w) with weight ≤ −n, in violation of 3.88.

Step 2: We prove the result for a general word w. It suffices to show that there exists at
least one i such that R_i(w) is a list of n terms. For then, since we obtain the same collection
of words by cyclically shifting w and R_i(w), the desired result will follow from Step 1.

First note that all cyclic rotations of w have weight Σ_{i=1}^{s} wt(w_i) = wt(w) = −n. Let m
be the minimum weight of any prefix w_1w_2···w_k of w, where 1 ≤ k ≤ s. Choose k minimal
such that wt(w_1w_2···w_k) = m. If k = s, then m = −n, and by minimality of k and 3.88,
w itself is already a list of n terms. Otherwise, let i = k + 1. We claim R_i(w) is a list of n
terms. It suffices to check that each proper prefix of R_i(w) has weight > −n. On one hand,
for all j with i ≤ j ≤ s, the prefix w_i···w_j of R_i(w) cannot have negative weight; otherwise,
wt(w_1···w_k w_i···w_j) < m violates the minimality of m. So wt(w_i···w_j) ≥ 0 > −n. Note
that when j = s, we have wt(w_i···w_s) = wt(w) − wt(w_1···w_k) = −n − m. Now consider
j in the range 1 ≤ j < k. If wt(w_i···w_s w_1···w_j) ≤ −n, then

    wt(w_1···w_j) = wt(w_i···w_s w_1···w_j) − wt(w_i···w_s) ≤ −n − (−n − m) = m.

But this violates the choice of k as the least index such that the prefix ending at k has
minimum weight. So wt(w_i···w_s w_1···w_j) > −n. It now follows from 3.88 that R_i(w) is
indeed a list of n terms. □
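
The cycle lemma is easy to test experimentally. The sketch below checks every rotation of
a word against the criterion of 3.88 and also computes the index i = k + 1 singled out in
Step 2 of the proof. The function names and the sample word are hypothetical illustrations;
indices are 1-based, as in the text.

```python
def is_list_of_terms(word, n):
    """Apply 3.88: total weight -n and every proper prefix of weight > -n."""
    running = 0
    for k, letter in enumerate(word, start=1):
        running += letter - 1
        if k < len(word) and running <= -n:
            return False
    return running == -n


def good_rotations(word):
    """Return all indices i such that R_i(w) is a list of n terms, where n = -wt(w)."""
    n = len(word) - sum(word)
    return [i for i in range(1, len(word) + 1)
            if is_list_of_terms(word[i - 1:] + word[:i - 1], n)]


def step2_index(word):
    """Return the index from Step 2: one past the first prefix of minimum weight."""
    running, minimum, k = 0, None, 0
    for position, letter in enumerate(word, start=1):
        running += letter - 1
        if minimum is None or running < minimum:
            minimum, k = running, position
    return 1 if k == len(word) else k + 1


w = [0, 0, 2, 0, 1, 0, 3, 0]      # hypothetical word of weight 6 - 8 = -2
print(good_rotations(w))          # [3, 7]
print(step2_index(w))             # 7
```

Here R_7(w) = 30002010 splits as the terms 3000 and 2010, and R_3(w) = 20103000 is the
same pair of terms in the other order.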


Suppose w is a list of n terms containing exactly k_i occurrences of i for each i ≥ 0. We
have

\[ -n = \operatorname{wt}(w) = \sum_{i \ge 0} k_i \operatorname{wt}(i) = \sum_{i \ge 0} (i-1) k_i = -k_0 + \sum_{i \ge 1} (i-1) k_i. \]

It follows that k_0 = n + Σ_{i≥1} (i − 1)k_i in this situation. Conversely, if k_0 satisfies this
relation, then wt(w) = −n for all w ∈ R(0^{k_0} 1^{k_1} 2^{k_2} ···). We now have all the ingredients
needed for our main enumeration result.




3.91. Theorem: Enumeration of Lists of Terms. Let n > 0 and k_0, k_1, ..., k_t be given
natural numbers such that k_0 = n + Σ_{i=1}^{t} (i − 1)k_i. The number N of words w such that w
is a list of n terms containing k_i copies of i for 0 ≤ i ≤ t is

\[ \frac{n}{s} \binom{s}{k_0, k_1, \ldots, k_t} = \frac{n\,(s-1)!}{k_0!\, k_1! \cdots k_t!}, \]

where s = Σ_{i=0}^{t} k_i = n + Σ_{i=1}^{t} i k_i is the common length of all such words.


Proof. Step 1. Let A be the set of all pairs (w, j), where w ∈ R(0^{k_0} 1^{k_1} ··· t^{k_t}) is a word and
j ≤ s is an index such that the cyclic rotation R_j(w) is a list of n terms. Combining 3.90
and 1.46, we see that \( |A| = n \binom{s}{k_0, k_1, \ldots, k_t} \).

Step 2. Let B be the set of all pairs (w, i) where w ∈ R(0^{k_0} 1^{k_1} ··· t^{k_t}) is a list of n terms
(necessarily of length s) and 1 ≤ i ≤ s. By the product rule, |B| = sN.

Step 3. To complete the proof, we show that |A| = |B| by exhibiting mutually inverse
bijections f : A → B and g : B → A. We define f(w, j) = (R_j(w), j) for all (w, j) ∈ A, and
g(w, i) = (R_i^{−1}(w), i) for all (w, i) ∈ B. Equating the two counts gives
\( sN = n \binom{s}{k_0, k_1, \ldots, k_t} \), which is the stated formula for N. □
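
For small parameters, the formula of 3.91 can be compared with a direct enumeration of all
rearrangements. The sketch below is a brute-force check; the function names are hypothetical,
and the tuple `ks` stands for (k_0, k_1, ..., k_t).

```python
from itertools import permutations
from math import factorial


def is_list_of_terms(word, n):
    """Apply 3.88: total weight -n and every proper prefix of weight > -n."""
    running = 0
    for k, letter in enumerate(word, start=1):
        running += letter - 1
        if k < len(word) and running <= -n:
            return False
    return running == -n


def formula_count(n, ks):
    """Evaluate n (s-1)! / (k_0! k_1! ... k_t!) with s = k_0 + ... + k_t."""
    if ks[0] != n + sum((i - 1) * k for i, k in enumerate(ks) if i >= 1):
        raise ValueError("k_0 must equal n + sum of (i - 1) k_i")
    s = sum(ks)
    denominator = 1
    for k in ks:
        denominator *= factorial(k)
    return n * factorial(s - 1) // denominator


def brute_force_count(n, ks):
    """Count the distinct rearrangements of 0^{k_0} 1^{k_1} ... that are lists of n terms."""
    letters = [i for i, k in enumerate(ks) for _ in range(k)]
    return sum(is_list_of_terms(w, n) for w in set(permutations(letters)))


print(formula_count(2, (5, 1, 1, 1)), brute_force_count(2, (5, 1, 1, 1)))  # 84 84
```

Taking n = 1 and ks = (4, 0, 3) counts terms built from three 2's and four 0's; both functions
return 5 in that case.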
3.15 Graph Coloring


This section introduces the graph coloring problem and some of its applications. 


3.92. Definition: Colorings. Let G = (V, E) be a simple graph, and let C be a finite set.
A coloring of G using colors in C is a function f : V → C. A coloring f of G is a proper
coloring iff for every edge {u, v} ∈ E, f(u) ≠ f(v).


Intuitively, we are coloring each vertex of G using one of the available colors in the set 
C. For each v € V, f(v) is the color assigned to vertex v. A coloring is proper iff no two 
adjacent vertices in G are assigned the same color. 


3.93. Definition: Chromatic Functions and Chromatic Numbers. Let G be a simple
graph. For each positive integer x, let χ_G(x) be the number of proper colorings of G using
colors in {1, 2, ..., x}. The function χ_G : N^+ → N is called the chromatic function of G.
The minimal x such that χ_G(x) > 0 is called the chromatic number of G.


The chromatic number is the least number of colors required to obtain a proper coloring
of G. The function χ_G is often called the chromatic polynomial of G because of 3.100 below.


3.94. Example. Suppose G is a simple graph with n vertices and no edges. Then χ_G(x) =
x^n since we can assign any of the x colors to each vertex. The chromatic number for this
graph is 1.


3.95. Example. At the other extreme, suppose G is a simple graph with n vertices such
that there is an edge joining every pair of distinct vertices. Color the vertices one at a time.
The first vertex can be colored in x ways. The second vertex must have a color different
from the first, so there are x − 1 choices. In general, the ith vertex must have a color distinct
from all of its predecessors, so there are x − (i − 1) choices for the color of this vertex. The
product rule gives χ_G(x) = x(x − 1)(x − 2)···(x − n + 1) = (x)↓_n. The chromatic number
for this graph is n. Recall from §2.13 that

\[ (x){\downarrow_n} = \sum_{k=1}^{n} s(n,k)\, x^k, \]

so that the function χ_G in this example is a polynomial whose coefficients are the signed
Stirling numbers of the first kind.


3.96. Example: Cycles. Consider the simple graph

$$G = (\{1,2,3,4\},\ \{\{1,2\},\{2,3\},\{3,4\},\{4,1\}\}).$$

$G$ consists of four vertices joined in a 4-cycle. We might attempt to compute $\chi_G(x)$ via the
product rule as follows. Color vertex 1 in $x$ ways. Then color vertex 2 in $x - 1$ ways, and color
vertex 3 in $x - 1$ ways. We run into trouble at vertex 4, because we do not know whether
vertices 1 and 3 were assigned the same color. This example shows that we cannot always
compute $\chi_G$ by the product rule alone. In this instance, we can classify proper colorings
based on whether vertices 1 and 3 receive the same or different colors. If they receive the
same color, the number of proper colorings is $x(x-1)(x-1)$ (color vertices 1 and 3 together,
then color vertex 2 a different color, then color vertex 4 a different color from 1 and 3). If
vertices 1 and 3 receive different colors, the number of proper colorings is $x(x-1)(x-2)(x-2)$
(color vertex 1, then vertex 3, then vertex 2, then vertex 4). Hence

$$\chi_G(x) = x(x-1)(x-1) + x(x-1)(x-2)(x-2) = x^4 - 4x^3 + 6x^2 - 3x.$$

The chromatic number for this graph is 2.
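
The case analysis for the 4-cycle can be confirmed numerically. The sketch below enumerates all colorings of the four vertices with $x$ colors and tallies the proper ones according to whether vertices 1 and 3 agree; the name `count_c4` is our own, and the check covers only the small values of $x$ in the loop.

```python
from itertools import product

C4_EDGES = [(1, 2), (2, 3), (3, 4), (4, 1)]


def count_c4(x):
    """Return (same, different): proper colorings of the 4-cycle with x colors,
    split by whether vertices 1 and 3 receive the same color."""
    same = different = 0
    for colors in product(range(x), repeat=4):
        f = dict(zip((1, 2, 3, 4), colors))
        if all(f[u] != f[v] for u, v in C4_EDGES):
            if f[1] == f[3]:
                same += 1
            else:
                different += 1
    return same, different


# Compare each case with the product-rule count, and the total with the polynomial.
for x in range(1, 7):
    same, different = count_c4(x)
    assert same == x * (x - 1) * (x - 1)
    assert different == x * (x - 1) * (x - 2) * (x - 2)
    assert same + different == x**4 - 4 * x**3 + 6 * x**2 - 3 * x
```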

More generally, consider the graph $C_n$ consisting of $n$ vertices joined in a cycle. It is
routine to establish that the chromatic number of $C_n$ is 1 for $n = 1$, is 2 for all even $n$, and
is 3 for all odd $n > 1$. On the other hand, it is not immediately evident how to compute
the chromatic function for $C_n$ when $n > 4$. We will deduce a recursion for these functions
shortly as a special case of a general recursion for computing chromatic functions.


Here is an application that can be analyzed using graph colorings and chromatic num- 
bers. Suppose we are trying to schedule meetings for a number of committees. If two com- 
mittees share a common member, they cannot meet at the same time. Consider the graph 
G whose vertices represent the various committees, and where there is an edge between two 
vertices iff the corresponding committees share a common member. Suppose there are x 
available time slots in which meetings may be scheduled. A coloring of G with x colors rep- 
resents a particular scheduling of committee meetings to time slots. The coloring is proper 
iff the schedule creates no time conflicts for any committee member. The chromatic number 
is the least number of time slots needed to avoid all conflicts, while $\chi_G(x)$ is the number of
different conflict-free schedules using x (distinguishable) time slots. 


3.97. Example. Six committees have members as specified in the following table. 


Committee | Members
A         | Kemp, Oakley, Saunders
B         | Gray, Saunders, Russell
C         | Byrd, Oakley, Quinn
D         | Byrd, Jenkins, Kemp
E         | Adams, Jenkins, Wilson
F         | Byrd, Gray, Russell


Figure 3.19 displays the graph $G$ associated to this set of committees. To compute $\chi_G(x)$,
consider cases based on whether vertices A and F receive the same color. If A and F are
colored the same, the number of proper colorings is $x(x-1)(x-2)(x-1)(x-1)$ [color
A and F, then C, D, B, and E]. If A and F receive different colors, the number of proper
colorings is $x(x-1)(x-2)(x-3)(x-2)(x-1)$ [color A, F, C, D, B, E]. Thus,

$$\chi_G(x) = x(x-1)^2(x-2)\bigl(x - 1 + (x-2)(x-3)\bigr) = x^6 - 8x^5 + 26x^4 - 42x^3 + 33x^2 - 10x.$$

The chromatic number of $G$ is 3.
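
As a check on this example, the conflict graph can be built directly from the membership lists and the schedules counted by enumeration. This is a sketch under one assumption: the six committees are labeled A through F in the order in which the table lists them. The names `committees`, `conflicts`, and `schedules` are our own.

```python
from itertools import combinations, product

# Assumption: committees are labeled A-F in the order listed in the table.
committees = {
    "A": {"Kemp", "Oakley", "Saunders"},
    "B": {"Gray", "Saunders", "Russell"},
    "C": {"Byrd", "Oakley", "Quinn"},
    "D": {"Byrd", "Jenkins", "Kemp"},
    "E": {"Adams", "Jenkins", "Wilson"},
    "F": {"Byrd", "Gray", "Russell"},
}

# Two committees conflict iff they share at least one member.
conflicts = [
    (p, q)
    for p, q in combinations(sorted(committees), 2)
    if committees[p] & committees[q]
]


def schedules(x):
    """Number of conflict-free assignments of committees to x time slots."""
    names = sorted(committees)
    total = 0
    for slots in product(range(x), repeat=len(names)):
        f = dict(zip(names, slots))
        if all(f[p] != f[q] for p, q in conflicts):
            total += 1
    return total


def chi(x):
    """The polynomial obtained in the example."""
    return x**6 - 8 * x**5 + 26 * x**4 - 42 * x**3 + 33 * x**2 - 10 * x
```

Under this labeling the conflict graph has eight edges, and `schedules(x)` agrees with `chi(x)` for the small values of $x$ we tried.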
FIGURE 3.19
Conflict graph for six committees.

FIGURE 3.20
Collapsing an edge in a simple graph.


We are about to present a general recursion that can be used to compute the chromatic 
function of a simple graph. The recursion makes use of the following construction. 


3.98. Definition: Collapsing an Edge. Let $G = (V, E)$ be a simple graph, and let
$e_0 = \{v_0, w_0\}$ be a fixed edge of $G$. Let $z_0$ be a new vertex. We define a simple graph
$H$ called the graph obtained from $G$ by collapsing the edge $e_0$. The vertex set of $H$ is
$(V \sim \{v_0, w_0\}) \cup \{z_0\}$. The edge set of $H$ is

$$\begin{aligned}
&\{\{x,y\} : x \neq v_0 \neq y \text{ and } x \neq w_0 \neq y \text{ and } \{x,y\} \in E\} \\
&\cup\ \{\{x, z_0\} : x \neq v_0 \text{ and } \{x, w_0\} \in E\} \\
&\cup\ \{\{x, z_0\} : x \neq w_0 \text{ and } \{x, v_0\} \in E\}.
\end{aligned}$$

Pictorially, we construct $H$ from $G$ by shrinking the edge $e_0$ until the vertices $v_0$ and $w_0$
coincide. We replace these two overlapping vertices with a single new vertex $z_0$. All edges
touching $v_0$ or $w_0$ (except the collapsed edge $e_0$) now touch $z_0$ instead. See Figure 3.20.


3.99. Theorem: Chromatic Recursion. Let $G = (V, E)$ be a simple graph. Fix any
edge $e = \{v, w\} \in E$. Let $G' = (V, E \sim \{e\})$ be the simple graph obtained by deleting the
edge $e$ from $G$, and let $G''$ be the simple graph obtained from $G$ by collapsing the edge $e$.
Then

$$\chi_G(x) = \chi_{G'}(x) - \chi_{G''}(x).$$


Proof. Fix $x \in \mathbb{N}^+$, and let $A$, $B$, and $C$ denote the sets of proper colorings of $G$, $G'$, and
$G''$ (respectively) using $x$ available colors. Write $B = B_1 \cup B_2$, where $B_1 = \{f \in B : f(v) = f(w)\}$
and $B_2 = \{f \in B : f(v) \neq f(w)\}$. Note that $B_1$ consists of the proper colorings of
$G'$ (if any) in which vertices $v$ and $w$ are assigned the same color. Let $z$ be the new vertex
in $G''$ that replaces $v$ and $w$. Given a proper coloring $f \in B_1$, we define a corresponding
coloring $f''$ of $G''$ by setting $f''(z) = f(v) = f(w)$ and $f''(u) = f(u)$ for all $u \in V$ different
from $v$ and $w$. Since $f$ is proper, it follows from the definition of the edge set of $G''$ that
$f''$ is a proper coloring as well. Thus we have a map $f \mapsto f''$ from $B_1$ to $C$. This map is
invertible, since the color of $z$ in a coloring of $G''$ determines the common color of $v$ and $w$
in a coloring of $G'$ belonging to $B_1$. We conclude that $|B_1| = |C|$.

On the other hand, $B_2$ consists of the proper colorings of $G'$ in which vertices $v$ and $w$
are assigned different colors. These are precisely the proper colorings of $G$ (since $G$ has an
edge between $v$ and $w$, and $G$ is otherwise identical to $G'$). Thus, $B_2 = A$. It follows that

$$\chi_G(x) = |A| = |B_2| = |B| - |B_1| = |B| - |C| = \chi_{G'}(x) - \chi_{G''}(x). \qquad \Box$$


3.100. Corollary: Polynomiality of Chromatic Functions. For any graph $G$, $\chi_G(x)$
is a polynomial in $x$ with integer coefficients. (This justifies the terminology chromatic
polynomial.)


Proof. We use induction on the number of edges in $G$. If $G$ has $k$ vertices and no edges, the
product rule gives $\chi_G(x) = x^k$, which is a polynomial in $x$. Now assume $G$ has $m > 0$ edges.
Fix such an edge $e$, and define $G'$ and $G''$ as in the preceding theorem. $G'$ has one fewer
edge than $G$. When passing from $G$ to $G''$, we lose the edge $e$ and possibly identify other
edges in $G$ (e.g., if both endpoints of $e$ are adjacent to a third vertex). In any case, $G''$ has
fewer edges than $G$. By induction on $m$, we may assume that both $\chi_{G'}(x)$ and $\chi_{G''}(x)$ are
polynomials in $x$ with integer coefficients. So $\chi_G(x) = \chi_{G'}(x) - \chi_{G''}(x)$ is also a polynomial
with integer coefficients. $\Box$


3.101. Remark. We can use the chromatic recursion to compute $\chi_G$ recursively for any
graph $G$. The base case of the calculation is a graph with $k$ vertices and no edges, which
has chromatic polynomial $x^k$. If $G$ has at least one edge, $G'$ and $G''$ both have strictly
fewer edges than $G$. Thus, the recursive calculation will terminate after finitely many steps.
However, this is quite an inefficient method for computing $\chi_G$ if $G$ has many vertices and
edges. In such cases, direct counting arguments using the sum and product rules may be
preferable to repeatedly applying the chromatic recursion.
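
The recursive computation described in this remark can be sketched as follows. A polynomial is stored as a list of integer coefficients, with index $k$ holding the coefficient of $x^k$; that representation and the function name are our own choices. As the remark warns, the number of recursive calls can grow exponentially with the number of edges, so this is an illustration rather than a practical algorithm.

```python
def chromatic_polynomial(vertices, edges):
    """Coefficients [c_0, c_1, ...] of chi_G(x) = sum_k c_k x^k, computed with
    chi_G = chi_{G'} - chi_{G''} (delete, respectively collapse, one edge)."""
    vertices = frozenset(vertices)
    edges = frozenset(frozenset(e) for e in edges)
    if not edges:
        # Base case: k vertices and no edges gives chi(x) = x^k.
        return [0] * len(vertices) + [1]
    e = next(iter(edges))
    v, w = tuple(e)
    deleted = chromatic_polynomial(vertices, edges - {e})
    # Collapse e, reusing v as the merged vertex. Edges that become parallel
    # are identified automatically because the edge set is a set.
    merged = {
        frozenset(v if u == w else u for u in edge)
        for edge in edges - {e}
    }
    collapsed = chromatic_polynomial(vertices - {w}, merged)
    collapsed = collapsed + [0] * (len(deleted) - len(collapsed))
    return [a - b for a, b in zip(deleted, collapsed)]
```

For instance, calling the function on the 4-cycle with vertices 1, 2, 3, 4 returns the coefficient list of $x^4 - 4x^3 + 6x^2 - 3x$.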


3.102. Example. Consider the graph $G$ shown on the left in Figure 3.21. We compute
$\chi_G(x)$ by applying the chromatic recursion to the edge $e = \{d, h\}$. The graphs $G'$ and $G''$
obtained by deleting and collapsing this edge are shown on the right in Figure 3.21. Direct
arguments using the product rule show that

$$\chi_{G'}(x) = x(x-1)(x-2)(x-2)(x-1)(x-1) \quad \text{(color $a$, $c$, $d$, $f$, $b$, $h$)};$$

$$\chi_{G''}(x) = x(x-1)(x-2)(x-2)(x-2) \quad \text{(color $z$, $a$, $c$, $f$, $b$)}.$$

Therefore, by the chromatic recursion,

$$\chi_G(x) = x(x-1)(x-2)^2\bigl((x-1)^2 - (x-2)\bigr) = x^6 - 8x^5 + 26x^4 - 43x^3 + 36x^2 - 12x.$$


3.103. Chromatic Polynomials for Cycles. For each $n \ge 3$, let $C_n$ denote a graph
consisting of $n$ vertices joined in a cycle. Let $C_1$ denote a one-vertex graph, and let $C_2$
denote a graph with two vertices joined by an edge. Finally, let $\chi_n(x) = \chi_{C_n}(x)$ be the
chromatic polynomials for these graphs. We see directly that

$$\chi_1(x) = x, \qquad \chi_2(x) = x(x-1) = x^2 - x, \qquad \chi_3(x) = x(x-1)(x-2) = x^3 - 3x^2 + 2x.$$


FIGURE 3.21
Using the chromatic recursion.


Fix $n \ge 3$ and fix any edge $e$ in $C_n$. Deleting this edge leaves a graph in which $n$ vertices
are joined in a line; the chromatic polynomial of such a graph is $x(x-1)^{n-1}$. On the other
hand, collapsing the edge $e$ in $C_n$ produces a graph isomorphic to $C_{n-1}$. The chromatic
recursion therefore gives

$$\chi_n(x) = x(x-1)^{n-1} - \chi_{n-1}(x).$$

Using this recursion to compute $\chi_n(x)$ for small $n$ suggests the closed formula

$$\chi_n(x) = (x-1)^n + (-1)^n (x-1) \qquad (n \ge 2).$$

We let the reader prove this formula for $\chi_n(x)$ by induction, using the chromatic recursion.
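
Before attempting the induction, one can compare the recursion with the closed formula numerically. The sketch below evaluates both at integer values of $x$; the function names are our own, and agreement on finitely many values is of course evidence, not a proof.

```python
def cycle_chromatic(n, x):
    """Evaluate chi_n(x) using chi_n = x (x-1)^(n-1) - chi_{n-1} for n >= 3,
    starting from chi_2(x) = x (x - 1)."""
    if n < 2:
        raise ValueError("this sketch starts the recursion at n = 2")
    value = x * (x - 1)  # chi_2(x)
    for m in range(3, n + 1):
        value = x * (x - 1) ** (m - 1) - value
    return value


def closed_form(n, x):
    """Conjectured closed formula (x-1)^n + (-1)^n (x-1), for n >= 2."""
    return (x - 1) ** n + (-1) ** n * (x - 1)
```

For example, `cycle_chromatic(4, 3)` gives the 18 proper colorings of the 4-cycle with three colors, in agreement with $x^4 - 4x^3 + 6x^2 - 3x$ at $x = 3$.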




3.16 Spanning Trees 


This section introduces the notion of a spanning tree for a graph. A recursion resembling 
the chromatic recursion 3.99 will allow us to count the spanning trees for a given graph. 
This will lead to a remarkable formula, called the matrix-tree theorem, that expresses the 
number of spanning trees as a certain determinant. 


3.104. Definition: Subgraphs. Let $G = (V, E, \epsilon)$ and $H = (W, F, \eta)$ be graphs or digraphs.
$H$ is a subgraph of $G$ iff $W \subseteq V$, $F \subseteq E$, and $\eta(f) = \epsilon(f)$ for all $f \in F$. $H$ is an
induced subgraph of $G$ iff $H$ is a subgraph such that $F$ consists of all edges in $E$ with both
endpoints in $W$.


3.105. Definition: Spanning Trees. Given a graph $G = (V, E, \epsilon)$, a spanning tree for
$G$ is a subgraph $H$ with vertex set $V$ such that $H$ is a tree. Let $\tau(G)$ be the number of
spanning trees of $G$.

FIGURE 3.22
Graph used to illustrate spanning trees.

FIGURE 3.23
Effect of deleting or collapsing an edge.


3.106. Example. Consider the graph G shown in Figure 3.22. This graph has 31 spanning 
trees, which are specified by the following sets of edges: 


{a,c,d,f}, {b,c,d,f}, {a,c,d,g}, {b,c,d,g}, {a,c,d,h}, {b,c,d,h},
{c,d,e,f}, {c,d,e,g}, {c,d,e,h}, {a,c,e,f}, {a,d,e,f}, {b,c,e,f},
{b,d,e,f}, {a,c,e,g}, {a,d,e,g}, {b,c,e,g}, {b,d,e,g}, {a,c,e,h},
{a,d,e,h}, {b,c,e,h}, {b,d,e,h}, {a,c,f,h}, {a,d,f,h}, {b,c,f,h},
{b,d,f,h}, {a,c,g,h}, {a,d,g,h}, {b,c,g,h}, {b,d,g,h}, {c,d,f,h},
{c,d,g,h}.


We see that even a small graph can have many spanning trees. Thus we seek a systematic 
method for enumerating these trees. 
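
For a graph this small, the count of 31 can be confirmed by brute force: choose $n - 1$ of the edges and test whether they form a tree. The sketch below does this with a union-find structure. The edge list is an assumption on our part: it records only endpoints (vertices numbered 0 to 4), read off from the Laplacian matrix given for this graph in Example 3.115, and does not reproduce the edge labels $a$ through $h$.

```python
from itertools import combinations

# Assumed endpoints of the eight edges of the graph in Figure 3.22,
# inferred from the Laplacian matrix in Example 3.115 (labels a-h omitted).
EDGES = [(0, 2), (0, 2), (0, 3), (0, 4), (1, 3), (1, 3), (1, 4), (2, 4)]


def is_spanning_tree(n, edge_subset):
    """A set of n - 1 edges on vertices 0..n-1 is a spanning tree iff it is acyclic."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for u, v in edge_subset:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True


def count_spanning_trees(n, edges):
    # Choose edges by index so that parallel edges count as distinct.
    return sum(
        is_spanning_tree(n, [edges[i] for i in idx])
        for idx in combinations(range(len(edges)), n - 1)
    )
```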


We are going to derive a recursion involving the quantities $\tau(G)$. For this purpose, we
need to adapt the ideas of deleting an edge and collapsing an edge (see 3.98) from simple
graphs to general graphs. Since loop edges are never involved in spanning trees, we will only
consider graphs without loops. Suppose we are given a graph $G = (V, E, \epsilon)$ and a fixed edge
$z \in E$ with endpoints $u, v \in V$. To delete $z$ from $G$, we replace $E$ by $E' = E \sim \{z\}$ and
replace $\epsilon$ by the restriction of $\epsilon$ to $E'$. To collapse the edge $z$, we act as follows: (i) delete $z$
and any other edges linking $u$ to $v$; (ii) replace $V$ by $(V \sim \{u, v\}) \cup \{w\}$, where $w$ is a new
vertex; (iii) for each edge $y \in E$ that has exactly one endpoint in the set $\{u, v\}$, modify $\epsilon(y)$
by replacing this endpoint with the new vertex $w$.


3.107. Example. Let $G$ be the graph shown in Figure 3.22. Figure 3.23 displays the graphs
obtained from $G$ by deleting edge $f$ and collapsing edge $f$.


3.108. Theorem: Spanning Tree Recursion. Let $G = (V, E, \epsilon)$ be a graph, and let
$z \in E$ be a fixed edge. Let $G_0$ be the graph obtained from $G$ by deleting $z$. Let $G_1$ be the
graph obtained from $G$ by collapsing $z$. Then

$$\tau(G) = \tau(G_0) + \tau(G_1).$$

The initial conditions are: $\tau(G) = 0$ if $G$ is not connected, and $\tau(G) = 1$ if $G$ is a tree with
vertex set $V$.
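
The recursion translates into a short recursive procedure. The sketch below is one possible rendering for loopless multigraphs given as lists of endpoint pairs. It uses slightly different, but equivalent, stopping rules from the initial conditions stated in the theorem: a one-vertex graph has one spanning tree, and a graph with several vertices and no edges has none. The test edge list is the assumed endpoint list for the graph of Figure 3.22 (inferred from the Laplacian in Example 3.115).

```python
# Assumed endpoints of the edges of the graph in Figure 3.22 (labels omitted).
EDGES = [(0, 2), (0, 2), (0, 3), (0, 4), (1, 3), (1, 3), (1, 4), (2, 4)]


def tau(vertices, edges):
    """Number of spanning trees via tau(G) = tau(G - z) + tau(G / z)."""
    vertices = set(vertices)
    edges = [e for e in edges if e[0] != e[1]]  # loops never occur in trees
    if len(vertices) == 1:
        return 1  # the one-vertex tree
    if not edges:
        return 0  # several vertices but no edges: not connected
    (u, v), rest = edges[0], edges[1:]
    deleted = tau(vertices, rest)
    # Collapse z = {u, v}: merge v into u. Edges parallel to z turn into loops
    # here and are discarded by the loop filter in the recursive call.
    relabeled = [(u if a == v else a, u if b == v else b) for a, b in rest]
    collapsed = tau(vertices - {v}, relabeled)
    return deleted + collapsed
```

Each call removes at least one edge, so the recursion terminates, although the number of calls may double at every level.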


Proof. For every graph $K$, let $\mathrm{Sp}(K)$ be the set of all spanning trees of $K$, so $\tau(K) = |\mathrm{Sp}(K)|$.
Fix the graph $G$ and the edge $z$. Let $X$ be the set of trees in $\mathrm{Sp}(G)$ that do not
use the edge $z$, and let $Y$ be the set of trees in $\mathrm{Sp}(G)$ that do use the edge $z$. $\mathrm{Sp}(G)$ is
the disjoint union of $X$ and $Y$, so $\tau(G) = |X| + |Y|$ by the sum rule. Now, it follows from
the definition of edge-deletion that the set $X$ is precisely the set $\mathrm{Sp}(G_0)$, so $|X| = \tau(G_0)$.
To complete the proof, we need to show that $|Y| = \tau(G_1)$. It suffices to define a bijection
$F : Y \to \mathrm{Sp}(G_1)$.

Suppose $T \in Y$ is a spanning tree of $G$ that uses the edge $z$ with endpoints $u, v$. Define
$F(T)$ to be the graph obtained from $T$ by collapsing the edge $z$; this graph is a subgraph
of $G_1$. Let $n$ be the number of vertices of $G$; then $T$ is a connected graph with $n - 1$ edges,
one of which is $z$. It is routine to check that $F(T)$ is still connected. Furthermore, since $T$ is
a tree, $z$ is the only edge in $T$ between $u$ and $v$. It follows from the definition of collapsing
that $F(T)$ has exactly $n - 2$ edges. Since $G_1$ has $n - 1$ vertices, it follows that $F(T)$ is a
spanning tree of $G_1$. We see also that the edge set of $F(T)$ is precisely the edge set of $T$
with $z$ removed. So far, we have shown that $F$ is a well-defined function mapping $Y$ into
$\mathrm{Sp}(G_1)$.

Next we define a map $H : \mathrm{Sp}(G_1) \to Y$ that will be the two-sided inverse of $F$. Given
$U \in \mathrm{Sp}(G_1)$ with edge set $E(U)$, let $H(U)$ be the unique subgraph of $G$ with vertex set $V$
and edge set $E(U) \cup \{z\}$. We must check that $H(U)$ does lie in the claimed codomain $Y$.
First, $H(U)$ is a subgraph of $G$ with $n - 1$ edges, one of which is the edge $z$. Furthermore,
one may check that $H(U)$ is connected, since walks in $U$ can be expanded using the edge
$z$ if needed to give walks in $H(U)$. Therefore, $H(U)$ is a spanning tree of $G$ using $z$, and
so $H(U) \in Y$. Since $F$ removes $z$ from the edge set while $H$ adds it back, $F$ and $H$ are
two-sided inverses of each other. Hence both are bijections, and the proof is complete. $\Box$


3.109. Example. Let us use the graphs in Figures 3.22 and 3.23 to illustrate the proof
of the spanning tree recursion, taking $z = f$. The graph $G_0$ on the left of Figure 3.23 has
19 spanning trees; they are precisely the trees listed in 3.106 that do not use the edge $f$.
Applying $F$ to each of the remaining 12 spanning trees on the list produces the following
subgraphs of $G_1$ (specified by their edge sets):

{a,c,d}, {b,c,d}, {c,d,e}, {a,c,e}, {a,d,e}, {b,c,e},
{b,d,e}, {a,c,h}, {a,d,h}, {b,c,h}, {b,d,h}, {c,d,h}.

These are precisely the spanning trees of $G_1$.

Next, we illustrate the calculation of $\tau(G)$ using the recursion. We first delete and
collapse edge $f$, producing the graphs $G_0$ and $G_1$ shown in Figure 3.23. We know that
$\tau(G) = \tau(G_0) + \tau(G_1)$. Deletion of edge $g$ from $G_0$ produces a new graph $G_2$ (Figure 3.24),
while collapsing $g$ in $G_0$ leads to another copy of $G_1$. So far, we have $\tau(G) = 2\tau(G_1) + \tau(G_2)$.
Continuing to work on $G_1$, we see that deleting (resp. collapsing) edge $h$ leads to the graph
$G_3$ (resp. $G_4$) in Figure 3.24. On the other hand, deleting $h$ from $G_2$ leaves a disconnected
graph (which can be discarded), while collapsing $h$ in $G_2$ produces another copy of
$G_3$. Now we have $\tau(G) = 3\tau(G_3) + 2\tau(G_4)$. Deleting edge $e$ from $G_3$ gives a graph that
has two spanning trees (by inspection), while collapsing $e$ in $G_3$ leads to $G_4$ again. So
$\tau(G) = 3(2 + \tau(G_4)) + 2\tau(G_4) = 6 + 5\tau(G_4)$. Finally, deletion of $d$ from $G_4$ leaves a graph
with two spanning trees, while collapsing $d$ produces a graph with three spanning trees.
We conclude that $\tau(G_4) = 5$, so $\tau(G) = 6 + 25 = 31$, in agreement with the enumeration
in 3.106.

FIGURE 3.24
Auxiliary graphs used in the computation of $\tau(G)$.


Next we extend the preceding discussion to rooted spanning trees in digraphs. 


3.110. Definition: Rooted Spanning Trees. Let $G = (V, E, \epsilon)$ be a digraph, and let
$v_0 \in V$. A spanning tree of $G$ rooted at $v_0$ is a rooted tree $T$ with root $v_0$ and vertex set
$V$ such that $T$ (without the loop at $v_0$) is a subgraph of $G$. Let $\tau(G, v_0)$ be the number of
spanning trees of $G$ rooted at $v_0$.


The notions of edge deletion and edge collapsing extend in a natural way to digraphs. With
these notions in hand, we have the following recursion for counting rooted spanning trees.


3.111. Theorem: Rooted Spanning Tree Recursion. Let $v_0$ be a fixed vertex in a
digraph $G$, and let $z$ be a fixed edge leading into $v_0$. Let $G_1$ be the digraph obtained from
$G$ by deleting $z$. Let $G_2$ be the digraph obtained from $G$ by collapsing $z$, and let the new
"collapsed" vertex in $G_2$ be $v_0'$. Then

$$\tau(G, v_0) = \tau(G_1, v_0) + \tau(G_2, v_0').$$


Proof. We modify the proof of 3.108. As before, the two terms on the right side count rooted
spanning trees of $G$ that do not contain $z$ or do contain $z$. The reader should check that if $T$
is a rooted spanning tree using the edge $z$, then the graph obtained from $T$ by collapsing $z$
is a rooted spanning tree of $G_2$ rooted at $v_0'$. Similarly, adding $z$ to the edge set of a rooted
spanning tree of $G_2$ rooted at $v_0'$ produces a rooted spanning tree of $G$ rooted at $v_0$. $\Box$


3.112. Remark. Our results for counting (undirected) spanning trees are special cases
of the corresponding results for rooted spanning trees. For, given a graph $G = (V, E, \epsilon)$,
consider the associated digraph obtained by replacing each $e \in E$ by two directed edges
going in opposite directions. Arguing as in 3.74, we see that there is a bijection between the
set of rooted spanning trees of this digraph rooted at any given vertex $v_0 \in V$ and the set
of spanning trees of $G$. In the sequel, we shall only treat the case of digraphs.




3.17 Matrix-Tree Theorem 


There is a remarkable determinant formula for the number of rooted spanning trees of a
digraph. The formula uses the following modified version of the adjacency matrix of the
digraph.

FIGURE 3.25
Digraph used to illustrate the matrix-tree theorem.


3.113. Definition: Laplacian Matrix of a Digraph. Let $G$ be a loopless digraph on
the vertex set $V = \{v_0, v_1, \ldots, v_n\}$. The Laplacian matrix of $G$ is the matrix $L = (L_{ij} : 0 \le i, j \le n)$
such that $L_{ii} = \mathrm{outdeg}(v_i)$ and, for $i \neq j$, $L_{ij}$ is the negative of the number of edges from $v_i$
to $v_j$ in $G$. We let $L_0$ be the $n \times n$ matrix obtained by erasing the row and column of $L$
corresponding to $v_0$. The matrix $L_0 = L_0(G)$ is called the truncated Laplacian matrix of $G$
(relative to $v_0$).


3.114. Matrix-Tree Theorem. With the notation of the preceding definition, we have

$$\tau(G, v_0) = \det(L_0(G)).$$

We prove the theorem after considering two examples.


3.115. Example. Let $G$ be the digraph associated to the undirected graph in Figure 3.22.
In this case, $L_{ii}$ is the degree of vertex $i$ in the undirected graph, and $L_{ij}$ is minus the
number of undirected edges between $i$ and $j$. So

$$L = \begin{bmatrix} 4 & 0 & -2 & -1 & -1 \\ 0 & 3 & 0 & -2 & -1 \\ -2 & 0 & 3 & 0 & -1 \\ -1 & -2 & 0 & 3 & 0 \\ -1 & -1 & -1 & 0 & 3 \end{bmatrix}.$$

Striking out the row and column corresponding to vertex 0 leaves

$$L_0 = \begin{bmatrix} 3 & 0 & -2 & -1 \\ 0 & 3 & 0 & -1 \\ -2 & 0 & 3 & 0 \\ -1 & -1 & 0 & 3 \end{bmatrix}.$$

We compute $\det(L_0) = 31$, which agrees with our earlier calculation of $\tau(G)$.
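
The determinant in this example is easy to recompute mechanically. The sketch below builds the truncated Laplacian from an undirected edge list and evaluates the determinant exactly by Gaussian elimination over the rationals. The edge list (endpoints only, vertices numbered 0 to 4) is our reading of the off-diagonal entries of $L$ above; the function names are our own.

```python
from fractions import Fraction

# Undirected edges read off from the off-diagonal entries of L (an assumption
# about the underlying graph; edge labels are not needed here).
EDGES = [(0, 2), (0, 2), (0, 3), (0, 4), (1, 3), (1, 3), (1, 4), (2, 4)]


def truncated_laplacian(n, edges, root=0):
    """Truncated Laplacian of the digraph obtained by doubling each undirected edge."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    keep = [i for i in range(n) if i != root]
    return [[L[i][j] for j in keep] for i in keep]


def det(matrix):
    """Exact determinant of an integer matrix by Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)  # the empty (0 x 0) determinant is 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return int(result)
```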


3.116. Example. Consider the digraph $G$ shown in Figure 3.25. We compute

$$L = \begin{bmatrix} 4 & 0 & -1 & 0 & -2 & -1 \\ -1 & 2 & 0 & -1 & 0 & 0 \\ -1 & 0 & 2 & 0 & -1 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & -3 & 4 & 0 \\ 0 & 0 & -1 & 0 & 0 & 1 \end{bmatrix}, \qquad L_0 = \begin{bmatrix} 2 & 0 & -1 & 0 & 0 \\ 0 & 2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & -3 & 4 & 0 \\ 0 & -1 & 0 & 0 & 1 \end{bmatrix},$$

and $\det(L_0) = 16$. So $G$ has 16 spanning trees rooted at 0, as one may confirm by direct
enumeration. We will use the matrix $L_0$ as a running example in the proof below.
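
The direct enumeration mentioned here can be sketched as follows. A spanning tree rooted at 0 amounts to a choice of one outgoing edge for each non-root vertex such that following the chosen edges from any vertex eventually leads to the root. The arc list below is an assumption: it is our reading of the Laplacian in this example (one arc from $i$ to $j$ for each unit of $-L_{ij}$), not data taken from Figure 3.25 itself.

```python
from itertools import product

# Assumed arcs (tail, head) of the digraph, inferred from the Laplacian matrix.
ARCS = [
    (0, 2), (0, 4), (0, 4), (0, 5),
    (1, 0), (1, 3),
    (2, 0), (2, 4),
    (3, 0),
    (4, 0), (4, 3), (4, 3), (4, 3),
    (5, 2),
]


def reaches_root(v, nxt, root, n):
    """Follow the chosen arcs from v for at most n steps."""
    for _ in range(n):
        if v == root:
            return True
        v = nxt[v]
    return v == root


def count_rooted_spanning_trees(n, arcs, root=0):
    others = [v for v in range(n) if v != root]
    # One list of possible heads per non-root vertex; parallel arcs appear
    # repeatedly and are therefore counted as distinct choices.
    choices = [[head for tail, head in arcs if tail == v] for v in others]
    total = 0
    for heads in product(*choices):
        nxt = dict(zip(others, heads))
        if all(reaches_root(v, nxt, root, n) for v in others):
            total += 1
    return total
```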


3.117. Proof of the Matrix-Tree Theorem. Write $L_0 = L_0(G)$. First we prove that
$\tau(G, v_0) = \det(L_0)$ in the case where $\mathrm{indeg}(v_0) = 0$. If $v_0$ is the only vertex of $G$, then
$\tau(G, v_0) = 1$ and $\det(L_0) = 1$ by the convention that the determinant of a $0 \times 0$ matrix is 1.
Otherwise, $\tau(G, v_0)$ is zero (no edge enters $v_0$, so no other vertex can reach the root), and
$L_0$ is a nonempty matrix. Using the condition $\mathrm{indeg}(v_0) = 0$
and the definition of $L_0$, one sees that every row of $L_0$ sums to zero. Therefore, letting $u$
be a column vector of $n$ ones, we have $L_0 u = 0$, so that $L_0$ is singular and $\det(L_0) = 0$.

For the general case, we argue by induction on the number of edges in $G$. The case
where $G$ has no edges is covered by the previous paragraph. The only case left to consider
occurs when $\mathrm{indeg}(v_0) > 0$. Let $e$ be a fixed edge in $G$ that leads from some $v_i$ to $v_0$. Let
$G_1$ be the graph obtained from $G$ by deleting $e$, and let $G_2$ be the graph obtained from $G$
by collapsing $e$. Both graphs have fewer edges than $G$, so the induction hypothesis tells us
that

$$\tau(G_1, v_0) = \det(L_0(G_1)) \quad\text{and}\quad \tau(G_2, v_0') = \det(L_0(G_2)), \tag{3.4}$$

where $v_0'$ is the new vertex created after collapsing $e$. Using 3.111, we conclude that

$$\tau(G, v_0) = \det(L_0(G_1)) + \det(L_0(G_2)). \tag{3.5}$$


Next, let us evaluate the determinant $\det(L_0(G))$. We will use the fact that the determinant
of a matrix is a linear function of each row of the matrix. More precisely, for a
fixed matrix $A$ and row index $i$, let $A[y]$ denote the matrix $A$ with the $i$th row replaced
by the row vector $y$; then $\det(A[y + z]) = \det(A[y]) + \det(A[z])$ for all $y, z$. This linearity
property can be proved directly from the definition of the determinant (see 9.37 and 9.45
below). To apply this result, write the $i$th row of $L_0 = L_0(G)$ in the form $y + z$, where
$z = (0, 0, \ldots, 1, 0, \ldots, 0)$ has a one in position $i$. Then

$$\det(L_0(G)) = \det(L_0[y]) + \det(L_0[z]). \tag{3.6}$$

For example, if $G$ is the digraph in Figure 3.25 and $e$ is the edge from 2 to 0 (so $i = 2$),
then $y = (0, 1, 0, -1, 0)$, $z = (0, 1, 0, 0, 0)$,

$$L_0[y] = \begin{bmatrix} 2 & 0 & -1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & -3 & 4 & 0 \\ 0 & -1 & 0 & 0 & 1 \end{bmatrix}, \qquad L_0[z] = \begin{bmatrix} 2 & 0 & -1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & -3 & 4 & 0 \\ 0 & -1 & 0 & 0 & 1 \end{bmatrix}.$$

Comparing equations (3.5) and (3.6), we see that it suffices to prove $\det(L_0(G_1)) = \det(L_0[y])$
and $\det(L_0(G_2)) = \det(L_0[z])$.

How does the removal of $e$ from $G$ affect $L(G)$? Answer: The $i,i$-entry drops by 1, while
the $i,0$-entry increases by 1. Since the zeroth column is ignored in the truncated Laplacian,
we see that we can obtain $L_0(G_1)$ from $L_0(G)$ by decrementing the $i,i$-entry by 1. In other
words, $L_0(G_1) = L_0[y]$, and hence $\det(L_0(G_1)) = \det(L_0[y])$.

Next, let us calculate $\det(L_0[z])$ by expanding the determinant along row $i$. The only
nonzero entry in this row is the 1 in the diagonal position, so $\det(L_0[z]) = (-1)^{i+i}\det(M) = \det(M)$,
where $M$ is the matrix obtained from $L_0[z]$ (or equivalently, from $L_0$) by erasing
row $i$ and column $i$. In our running example,

$$M = \begin{bmatrix} 2 & -1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & -3 & 4 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

We claim that $M = L_0(G_2)$, which will complete the proof. Consider the $k,j$-entry of
$M$, where $k, j \in \{0, 1, \ldots, n\} \sim \{0, i\}$. If $k = j$, this entry is $\mathrm{outdeg}_G(v_j)$, which equals
$\mathrm{outdeg}_{G_2}(v_j)$ because $v_j$ is not $v_0$, $v_i$, or $v_0'$. For the same reason, if $k \neq j$, the $k,j$-entry of
$M$ is minus the number of edges from $v_k$ to $v_j$, which is the same in $G$ and $G_2$.




3.18 Eulerian Tours 


3.118. Definition: Eulerian Tours. Let $G = (V, E, \epsilon)$ be a digraph. An Eulerian tour in
$G$ is a walk $W = (v_0, e_1, v_1, e_2, v_2, \ldots, e_n, v_n)$ such that $W$ visits every vertex in $V$, and $W$
uses every edge in $E$ exactly once. Such a tour is called closed iff $v_n = v_0$.


3.119. Example. Consider the digraph $G$ shown in Figure 3.26. Here is one closed Eulerian
tour of $G$:

$$W_1 = (0, m, 2, l, 5, e, 1, a, 3, c, 4, b, 3, d, 5, f, 4, g, 5, k, 0, i, 4, h, 5, j, 0).$$

To specify the tour, it suffices to list only the edges in the tour. For instance, here is the
edge sequence of another closed Eulerian tour of $G$:

$$W_2 = (i, g, e, a, d, f, b, c, h, j, m, l, k).$$


3.120. Example. Consider the digraph G shown in Figure 3.2. This graph does not have 
any closed Eulerian tours, since there is no way to reach vertex 6 from the other vertices. 
Even if we delete vertex 6 from the graph, there will still be no closed Eulerian tours. For, 
there is no way that a tour can use both edges leaving vertex 2, since only one edge enters 
vertex 2. 


The previous example indicates two necessary conditions for a digraph to have a closed 
Eulerian tour: the digraph must be connected, and also balanced in the sense that indeg(v) = 
outdeg(v) for every vertex v. We now show that these necessary conditions are also sufficient 
to guarantee the existence of a closed Eulerian tour. 


3.121. Theorem: Existence of Closed Eulerian Tours. A digraph $G = (V, E, \epsilon)$ has
a closed Eulerian tour iff $G$ is connected and balanced.


FIGURE 3.26
Digraph used to illustrate Eulerian tours.


Proof. First suppose $G$ has a closed Eulerian tour $W$ starting at $v_0$. Since $W$ visits every
vertex, we can obtain a walk from any vertex to any other vertex by following suitable edges
of $W$. So $G$ is connected. Next, let $v$ be any vertex of $G$. The walk $W$ arrives at $v$ via an
incoming edge exactly as often as the walk leaves $v$ via an outgoing edge; this is true even
if $v = v_0$. Since the walk uses every edge exactly once, it follows that $\mathrm{indeg}(v) = \mathrm{outdeg}(v)$.

Conversely, assume that $G$ is connected and balanced. Let $W = (v_0, e_1, v_1, \ldots, e_n, v_n)$
be a walk of maximum length in $G$ that never repeats an edge. We claim that $v_n = v_0$. For,
if not, $W$ enters vertex $v_n$ one more time than it leaves $v_n$. Since $\mathrm{indeg}(v_n) = \mathrm{outdeg}(v_n)$,
there must be an outgoing edge from $v_n$ that has not been used by $W$. So we could use
this edge to extend $W$, contradicting maximality. Next, we claim that $W$ uses every edge
of $G$. If not, let $e$ be an edge not used by $W$. Since $G$ is connected, we can find such an
edge that is incident to one of the vertices $v_i$ visited by $W$. Since $v_n = v_0$, we can cyclically
shift the walk $W$ to get a new walk $W' = (v_i, e_{i+1}, v_{i+1}, \ldots, e_n, v_n = v_0, e_1, \ldots, e_i, v_i)$ that
starts and ends at $v_i$. By adding the edge $e$ to the beginning or end of this walk (depending
on its direction), we could again produce a longer walk than $W$ with no repeated edges,
violating maximality. Finally, $W$ must visit every vertex of $G$, since $W$ uses every edge of
$G$ and (unless $G$ has one vertex and no edges) every vertex has an edge leaving it. $\Box$
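
The proof above is an existence argument based on a walk of maximum length. To actually construct a tour, a standard alternative is Hierholzer's algorithm, which is a different technique from the one in the proof: it follows unused edges until stuck and splices in detours while backtracking. The sketch below is one possible rendering. The arc table is our reading of the digraph of Example 3.119, inferred from the tour $W_1$ given there, and the function names are our own.

```python
# Arcs of the digraph in Example 3.119 as label -> (tail, head), inferred
# from the tour W_1 (an assumption, since Figure 3.26 is not reproduced).
ARCS = {
    "a": (1, 3), "b": (4, 3), "c": (3, 4), "d": (3, 5), "e": (5, 1),
    "f": (5, 4), "g": (4, 5), "h": (4, 5), "i": (0, 4), "j": (5, 0),
    "k": (5, 0), "l": (2, 5), "m": (0, 2),
}


def is_balanced(arcs):
    """True iff indeg(v) = outdeg(v) for every vertex appearing in an arc."""
    diff = {}
    for tail, head in arcs.values():
        diff[tail] = diff.get(tail, 0) + 1
        diff[head] = diff.get(head, 0) - 1
    return all(d == 0 for d in diff.values())


def closed_eulerian_tour(arcs, start):
    """Hierholzer's algorithm. Returns the list of arc labels of a closed
    Eulerian tour from `start`, or None if no such tour exists."""
    if not is_balanced(arcs):
        return None
    unused = {}
    for label, (tail, head) in sorted(arcs.items()):
        unused.setdefault(tail, []).append((label, head))
    stack = [(start, None)]  # (vertex, label of the arc used to reach it)
    tour = []
    while stack:
        v, label = stack[-1]
        if unused.get(v):
            next_label, head = unused[v].pop()
            stack.append((head, next_label))
        else:
            stack.pop()
            if label is not None:
                tour.append(label)
    tour.reverse()
    # If some arc was never reached, the digraph is not connected.
    return tour if len(tour) == len(arcs) else None
```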


Our goal in the rest of this section is to prove the following formula for the number of
closed Eulerian tours in $G$ starting at a given vertex $v_0$. Recall that $\tau(G, v_0)$ is the number
of rooted spanning trees of $G$ rooted at $v_0$.


3.122. Theorem: Counting Eulerian Tours. Let $G = (V, E, \epsilon)$ be a connected, balanced
digraph. For each $v_0 \in V$, the number of closed Eulerian tours of $G$ starting at $v_0$ is

$$\tau(G, v_0) \cdot \mathrm{outdeg}(v_0)! \cdot \prod_{v \neq v_0} (\mathrm{outdeg}(v) - 1)!. \tag{3.7}$$
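
Before turning to the proof, the formula can be tested on a small digraph. The sketch below counts closed Eulerian tours by backtracking and compares the result with formula (3.7), where the number of rooted spanning trees is itself found by brute force. The arc table is our reading of the digraph of Example 3.119 (inferred from the tour $W_1$); all function names are our own.

```python
from itertools import product
from math import factorial

# Arcs of the digraph in Example 3.119 as label -> (tail, head) (assumed).
ARCS = {
    "a": (1, 3), "b": (4, 3), "c": (3, 4), "d": (3, 5), "e": (5, 1),
    "f": (5, 4), "g": (4, 5), "h": (4, 5), "i": (0, 4), "j": (5, 0),
    "k": (5, 0), "l": (2, 5), "m": (0, 2),
}


def count_tours(arcs, start):
    """Count closed Eulerian tours from `start` by backtracking."""
    out = {}
    for label, (tail, head) in arcs.items():
        out.setdefault(tail, []).append((label, head))
    used = set()

    def extend(v):
        if len(used) == len(arcs):
            return 1 if v == start else 0
        total = 0
        for label, head in out.get(v, []):
            if label not in used:
                used.add(label)
                total += extend(head)
                used.remove(label)
        return total

    return extend(start)


def count_rooted_trees(arcs, root):
    """Brute-force tau(G, root): one outgoing arc per non-root vertex,
    with every vertex eventually led to the root."""
    vertices = {v for arc in arcs.values() for v in arc}
    others = sorted(vertices - {root})
    choices = [[h for t, h in arcs.values() if t == v] for v in others]
    total = 0
    for heads in product(*choices):
        nxt = dict(zip(others, heads))
        ok = True
        for v in others:
            for _ in range(len(vertices)):
                if v == root:
                    break
                v = nxt[v]
            ok = ok and v == root
        total += ok
    return total


def formula(arcs, root):
    """Right side of (3.7)."""
    outdeg = {}
    for tail, _ in arcs.values():
        outdeg[tail] = outdeg.get(tail, 0) + 1
    value = count_rooted_trees(arcs, root) * factorial(outdeg[root])
    for v, d in outdeg.items():
        if v != root:
            value *= factorial(d - 1)
    return value
```

For this digraph and root 0, both counts come out to 240, with 10 rooted spanning trees.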


Let $\{v_0, v_1, \ldots, v_n\}$ be the vertex set of $G$. Let $X$ be the set of all closed Eulerian tours
of $G$ starting at $v_0$. Let $\mathrm{SpTr}(G, v_0)$ be the set of spanning trees of $G$ rooted at $v_0$. Let $Y$
be the set of all tuples $(T, w_0, w_1, w_2, \ldots, w_n)$ where: $T \in \mathrm{SpTr}(G, v_0)$; $w_0$ is a permutation
of all the edges leaving $v_0$; and, for $1 \le i \le n$, $w_i$ is a permutation of those edges leaving $v_i$
other than the unique outgoing edge from $v_i$ that belongs to $T$ (see 3.42). By the product
rule, the cardinality of $Y$ is given by the right side of (3.7). So it will suffice to define a
bijection $f : X \to Y$.

Given an Eulerian tour $W \in X$, define $f(W) = (T, w_0, \ldots, w_n)$ as follows. For each $i$
between 0 and $n$, let $w_i'$ be the permutation of all edges leading out of $v_i$, taken in the
order in which they occur in the walk $W$. Call $w_i'$ the departure word of vertex $v_i$. Next, set
$w_0 = w_0'$ and for $i > 0$, let $w_i$ be the word $w_i'$ with the last symbol erased. Finally, let $T$
be the subgraph of $G$ whose edges are given by the last symbols of $w_1', \ldots, w_n'$, augmented
by a loop edge at $v_0$. It is not immediately evident that $T \in \mathrm{SpTr}(G, v_0)$; we will prove this
shortly.

Next we define a map $g : Y \to X$ that will be the two-sided inverse of $f$. Fix
$(T, w_0, \ldots, w_n) \in Y$. For every $i > 0$, form $w_i'$ by appending the unique edge of $T$ leaving
$v_i$ to the end of the word $w_i$; let $w_0' = w_0$. Starting at $v_0$, we use the words $w_i'$ to build a
walk through $G$, one edge at a time, as follows. If we are currently at some vertex $v_i$, use the
next unread symbol in $w_i'$ to determine which edge to follow out of $v_i$. Repeat this process
until the walk reaches a vertex in which all the outgoing edges have already been used. The
resulting walk $W$ is $g(T, w_0, \ldots, w_n)$. The edges occurring in $W$ are pairwise distinct, but
it is not immediately evident that $W$ must use all edges of $G$; we will prove this shortly.

Once we check that $f$ and $g$ map into their stated codomains, the definitions just given
show that $f \circ g$ and $g \circ f$ are both identity maps. Before proving that $f$ maps into $Y$ and $g$
maps into $X$, let us consider an example.


FIGURE 3.27
Rooted spanning trees associated to Eulerian tours.


3.123. Example. We continue the analysis of Eulerian tours in the digraph $G$ from 3.119.
The walk $W_1$ in that example has departure words $w_0' = mi$, $w_1' = a$, $w_2' = l$, $w_3' = cd$,
$w_4' = bgh$, and $w_5' = efkj$. Therefore,

$$f(W_1) = (T_1, mi, \cdot, \cdot, c, bg, efk),$$

where $\cdot$ denotes an empty word and $T_1$ is the graph shown on the left in Figure 3.27.
Similarly, for $W_2$ we compute $w_0' = im$, $w_1' = a$, $w_2' = l$, $w_3' = dc$, $w_4' = gbh$, $w_5' = efjk$,
and

$$f(W_2) = (T_2, im, \cdot, \cdot, d, gb, efj).$$

Let us now calculate $g((T_1, im, \cdot, \cdot, c, bg, fke))$. First, we use the edges of $T_1$ to recreate the
departure words $w_0' = im$, $w_1' = a$, $w_2' = l$, $w_3' = cd$, $w_4' = bgh$, and $w_5' = fkej$. We then
use these words to guide our tour through the graph. We begin with $0, i, 4$, since $i$ is the
first letter of $w_0'$. Consulting $w_4'$ next, we follow edge $b$ to vertex 3, then edge $c$ to vertex 4,
then edge $g$ to vertex 5, and so on. We obtain the tour

$$W_3 = (0, i, 4, b, 3, c, 4, g, 5, f, 4, h, 5, k, 0, m, 2, l, 5, e, 1, a, 3, d, 5, j, 0).$$

Similarly, the reader may check that

$$g((T_2, mi, \cdot, \cdot, d, bg, jfe)) = (m, l, j, i, b, d, f, g, e, a, c, h, k).$$
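
The maps $f$ and $g$ are simple enough to carry out by machine, which gives a way to check the computations in this example. In the sketch below a tour is a list of edge labels, a tree is the set of its non-loop edge labels, and the words $w_i$ are stored in a dictionary indexed by vertex. These representations, and the arc table inferred from the tour $W_1$ of Example 3.119, are our own assumptions.

```python
# Arcs of the digraph in Example 3.119 as label -> (tail, head) (assumed).
ARCS = {
    "a": (1, 3), "b": (4, 3), "c": (3, 4), "d": (3, 5), "e": (5, 1),
    "f": (5, 4), "g": (4, 5), "h": (4, 5), "i": (0, 4), "j": (5, 0),
    "k": (5, 0), "l": (2, 5), "m": (0, 2),
}

W1 = list("mleacbdfgkihj")  # edge sequence of the tour W_1


def f(tour, arcs, root=0):
    """Tour -> (tree, words): split each departure word into its last
    symbol (a tree edge) and the remaining word; the root keeps its word."""
    departure = {}
    for label in tour:
        departure.setdefault(arcs[label][0], []).append(label)
    tree = {w[-1] for v, w in departure.items() if v != root}
    words = {
        v: "".join(w if v == root else w[:-1]) for v, w in departure.items()
    }
    return tree, words


def g(tree, words, arcs, root=0):
    """(tree, words) -> tour: rebuild the departure words, then walk from
    the root, always taking the next unread symbol, until stuck."""
    departure = {v: list(w) for v, w in words.items()}
    for label in tree:
        departure[arcs[label][0]].append(label)
    position = {v: 0 for v in departure}
    v, tour = root, []
    while position[v] < len(departure[v]):
        label = departure[v][position[v]]
        position[v] += 1
        tour.append(label)
        v = arcs[label][1]
    return tour
```

Applying `f` to `W1` returns the tree with edges $a, l, d, h, j$ together with the words $mi, \cdot, \cdot, c, bg, efk$, and `g` applied to that output recovers `W1`.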


To complete the proof of 3.122, we must prove two things. First, to show that $f(W) \in Y$
for all $W \in X$, we must show that the digraph $T$ obtained from the last letters of the
departure words $w_i'$ ($i > 0$) is a rooted spanning tree of $G$ rooted at $v_0$. Since $W$ visits every
vertex of $G$, the definition of $T$ shows that $\mathrm{outdeg}_T(v_i) = 1$ for all $i \ge 0$. We need only
show that $T$ has no cycles other than the loop at $v_0$ (see 3.42). We can view the tour $W$ as
a certain permutation of all the edges in $G$. Let us show that if $e, h$ are two non-loop edges
in $T$ with $\epsilon(e) = (x, y)$ and $\epsilon(h) = (y, z)$, then $e$ must precede $h$ in the permutation $W$.
Note that $y$ cannot be $v_0$, since the only outgoing edge from $v_0$ in $T$ is a loop edge. Thus,
when the tour $W$ uses the edge $e$ to enter $y$, the following edge in the tour exists and is
an outgoing edge from $y$. Since $h$ is, by definition, the last such edge used by the tour, $e$
must precede $h$ in the tour. Now suppose $(z_0, e_1, z_1, \ldots, e_n, z_n)$ is a cycle in $T$ that is not
the 1-cycle at $v_0$. Using the previous remark repeatedly, we see that $e_i$ precedes $e_{i+1}$ in $W$
for all $i$, and also $e_n$ precedes $e_1$ in $W$. These statements imply that $e_1$ precedes itself in
$W$, which is absurd. We conclude that $f(W) \in Y$.

Second, we must show that $g$ maps $Y$ into $X$. Fix $(T, w_0, \ldots, w_n) \in Y$ and $W = g(T, w_0, \ldots, w_n)$,
and let $w_i'$ be the departure words constructed from $T$ and the $w_i$'s. We
know from the definition of $g$ that $W$ is a walk in $G$ starting at $v_0$ that never repeats an
edge. We must show that $W$ ends at $v_0$ and uses every edge in $G$. Suppose, at some stage in
the construction of $W$, that $W$ has just reached $v_i$ for some $i > 0$. Then $W$ has entered $v_i$
one more time than it has left $v_i$. Since $G$ is balanced, there must exist an unused outgoing
edge from $v_i$. This edge corresponds to an unused letter in $w_i'$. So $W$ does not end at $v_i$.
The only possibility is that $W$ ends at the starting vertex $v_0$.

To prove that W uses every edge of G, we claim that it is enough to prove that W uses
every non-loop edge of T. For, consider a vertex $v \neq v_0$ of G. If W uses the unique outgoing
edge from v that is part of T, then W must have previously used all other outgoing edges
from v, by definition of W. Since W ends at $v_0$, W certainly uses all outgoing edges from
$v_0$. All edges are accounted for in this way, proving the claim.

Finally, to get a contradiction, assume that some edge e in T from x to y is not used
by W. Since T is a rooted tree rooted at $v_0$, we can choose such an e so that the distance
from y to $v_0$ through edges in T is minimal. If $y \neq v_0$, minimality implies that the unique
edge leading out of y in T does belong to W. Then, as noted in the last paragraph, every
outgoing edge from y in G is used in W. Since G is balanced, every incoming edge into y in
G must also appear in W, contradicting the assumption that e is not used by W. On the
other hand, if $y = v_0$, we see similarly that W uses every outgoing edge from y in G and
hence every incoming edge to y in G. Again, this contradicts the assumption that e is not
in W. This completes the proof of 3.122.


Summary 
Table 3.1 contains brief definitions of the terminology from graph theory used in this chapter. 
• Facts about Matrix Multiplication. If $A_1, \ldots, A_s$ are matrices such that $A_i$ is $n_{i-1} \times n_i$,
then the $i,j$-entry of the product $A_1 A_2 \cdots A_s$ is

$$\sum_{k_1=1}^{n_1} \sum_{k_2=1}^{n_2} \cdots \sum_{k_{s-1}=1}^{n_{s-1}} A_1(i, k_1) A_2(k_1, k_2) A_3(k_2, k_3) \cdots A_s(k_{s-1}, j).$$

If $A^s = 0$ (i.e., A is nilpotent), then $I - A$ is invertible, and

$$(I - A)^{-1} = I + A + A^2 + A^3 + \cdots + A^{s-1}.$$

This formula applies (with $s = n$) when A is a strictly upper or lower triangular $n \times n$
matrix.


• Adjacency Matrices and Walks. Given a graph or digraph G with vertex set $\{v_1, \ldots, v_n\}$,
the adjacency matrix of G is the matrix A such that $A(i, j)$ is the number of edges from
$v_i$ to $v_j$ in G. For all $s \ge 0$, $A^s(i, j)$ is the number of walks in G of length s from $v_i$
to $v_j$. G is a DAG iff $A^n = 0$, in which case A will be strictly lower-triangular under a
suitable ordering of the vertices. When G is a DAG, $(I - A)^{-1}(i, j)$ is the total number
of paths (or walks) from $v_i$ to $v_j$.
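
The two matrix facts above can be checked numerically on a small example. The following is a minimal sketch, not part of the original text: the helper names and the four-vertex DAG are hypothetical choices made for illustration, and matrices are stored as plain lists of rows with vertices numbered from 0.

```python
def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def identity(n):
    """Return the n x n identity matrix."""
    return [[int(i == j) for j in range(n)] for i in range(n)]


def walks_of_length(A, s):
    """Return A^s; entry (i, j) counts the walks of length s from v_i to v_j."""
    result = identity(len(A))
    for _ in range(s):
        result = mat_mul(result, A)
    return result


def total_walks_in_dag(A):
    """Return I + A + ... + A^(n-1), which equals (I - A)^(-1) when A^n = 0."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    power = identity(n)
    for _ in range(n):
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, A)
    if any(any(row) for row in power):  # power is now A^n
        raise ValueError("A^n != 0, so the digraph is not a DAG")
    return total


# Hypothetical DAG on vertices 0..3 with edges 0->1, 0->2, 1->2, 1->3, 2->3.
A = [[0, 1, 1, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
```

In this example there are two walks of length 2 from vertex 0 to vertex 3 and three walks in total, which is what `walks_of_length(A, 2)[0][3]` and `total_walks_in_dag(A)[0][3]` report. The vertex ordering used here makes A strictly upper-triangular; reversing the ordering would make it strictly lower-triangular.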


• Degree-Sum Formulas. For a digraph $G = (V, E, \epsilon)$,

$$\sum_{v \in V} \mathrm{indeg}_G(v) = |E| = \sum_{v \in V} \mathrm{outdeg}_G(v).$$

For a graph $G = (V, E, \epsilon)$,

$$\sum_{v \in V} \deg_G(v) = 2|E|.$$


TABLE 3.1
Terminology used in graph theory.

Term                 | Brief Definition
graph                | $(V, E, \epsilon)$ where $\epsilon(e) = \{v, w\}$ means edge e has endpoints v, w
digraph              | $(V, E, \epsilon)$ where $\epsilon(e) = (v, w)$ means edge e goes from v to w
simple graph         | graph with no loops or multiple edges
simple digraph       | digraph with no multiple edges
$G \cong H$          | G becomes H under suitable renaming of vertices and edges
walk                 | $(v_0, e_1, v_1, \ldots, e_s, v_s)$ where each $e_i$ is an edge from $v_{i-1}$ to $v_i$
closed walk          | walk starts and ends at same vertex
path                 | walk visiting distinct vertices
cycle                | closed walk visiting distinct vertices and edges, except at end
DAG                  | digraph with no cycles
$\mathrm{indeg}_G(v)$  | number of edges leading to v in digraph G
$\mathrm{outdeg}_G(v)$ | number of edges leading from v in digraph G
$\deg_G(v)$          | number of edges incident to v in graph G (loops count as 2)
isolated vertex      | vertex of degree zero
leaf                 | vertex of degree one
functional digraph   | simple digraph with outdeg(v) = 1 for all vertices v
cyclic vertex        | vertex in functional digraph that belongs to a cycle
rooted tree          | functional digraph with a unique cyclic vertex (the root)
G is connected       | for all $u, v \in V(G)$, there is a walk in G from u to v
cut-edge of G        | edge belonging to no cycle of the graph G
forest               | graph with no cycles
acyclic graph        | graph with no cycles
tree                 | connected graph with no cycles
proper coloring      | map $f : V(G) \to C$ assigning unequal colors to adjacent vertices
$\chi_G(x)$          | number of proper colorings of G using x available colors
chromatic number     | least x with $\chi_G(x) > 0$
subgraph of G        | graph $G'$ with $V(G') \subseteq V(G)$, $E(G') \subseteq E(G)$ (same endpoints)
induced subgraph     | subgraph $G'$ where all edges in G with ends in $V(G')$ are kept
spanning tree of G   | subgraph of G that is a tree using all vertices
$\tau(G)$            | number of spanning trees of G
rooted spanning tree | rooted tree using all vertices of a digraph
$\tau(G, v_0)$       | number of rooted spanning trees of G with root $v_0$
Eulerian tour        | walk visiting each vertex that uses every edge once


• Functional Digraphs. For a finite set X, every function $f : X \to X$ has an associated
functional digraph with vertex set X and edge set $\{(x, f(x)) : x \in X\}$. Every functional
digraph decomposes uniquely into one or more disjoint cycles together with disjoint
rooted trees rooted at the vertices on these cycles. For each vertex $x_0$ in a functional
digraph, there exist unique walks of each length k starting at $x_0$, which are found
by repeatedly following the unique outgoing edge from the current vertex. Such walks
eventually reach a cycle in the functional digraph.
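
As a small illustration of following the unique outgoing edge, here is a sketch under stated assumptions: the function is stored as a Python dictionary, and the names `walk_from` and `cyclic_vertices` as well as the sample map $x \mapsto 2x \bmod 6$ are hypothetical. It uses the fact that a vertex x of an n-vertex functional digraph is cyclic iff some iterate $f^k(x)$ with $1 \le k \le n$ equals x.

```python
def walk_from(f, x0, k):
    """Return the vertices on the unique walk of length k starting at x0."""
    walk = [x0]
    for _ in range(k):
        walk.append(f[walk[-1]])
    return walk


def cyclic_vertices(f):
    """Return the set of vertices lying on a cycle of the functional digraph of f."""
    n = len(f)
    cyclic = set()
    for x in f:
        y = f[x]
        for _ in range(n):      # a cyclic vertex returns to itself within n steps
            if y == x:
                cyclic.add(x)
                break
            y = f[y]
    return cyclic


# Hypothetical example: f(x) = 2x mod 6 on X = {0, 1, ..., 5}.
f = {x: (2 * x) % 6 for x in range(6)}
```

Here the walk of length 3 from vertex 1 is 1, 2, 4, 2, which has entered the 2-cycle on {2, 4}; the cyclic vertices are 0, 2, and 4.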


• Cycle Structure of Permutations. For X finite, a map $f : X \to X$ is a bijection iff
the functional digraph of f is a disjoint union of directed cycles. The signless Stirling
number of the first kind, $s'(n, k)$, counts the number of bijections f on an n-element set
such that the functional digraph of f has k cycles. We have

$$s'(n, k) = s'(n-1, k-1) + (n-1)\, s'(n-1, k) \qquad (0 < k < n).$$
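
The recursion can be compared against a direct count of permutations by number of cycles. This is a sketch only: the boundary values ($s'(0,0) = 1$, and $s'(n,k) = 0$ when $k \le 0 < n$ or $k > n$) are assumptions added here so that the recursion can be run for all n and k, and the function names are hypothetical.

```python
from functools import lru_cache
from itertools import permutations


@lru_cache(maxsize=None)
def signless_stirling(n, k):
    """Compute s'(n, k) from the recursion, with assumed boundary values."""
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0 or k > n:
        return 0
    return signless_stirling(n - 1, k - 1) + (n - 1) * signless_stirling(n - 1, k)


def count_cycles(perm):
    """Count cycles of a permutation of {0, ..., n-1} given as a tuple of images."""
    seen = set()
    cycles = 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return cycles


def brute_force_count(n, k):
    """Count bijections on an n-element set whose functional digraph has k cycles."""
    return sum(1 for p in permutations(range(n)) if count_cycles(p) == k)
```

For instance, both `signless_stirling(4, 2)` and `brute_force_count(4, 2)` give 11.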


• Connectedness and Components. The vertex set of any graph or digraph G is the disjoint
union of connected components. Two vertices belong to the same component iff
each vertex is reachable from the other by a walk. G is connected iff there is only one
component iff for all $u, v \in V(G)$ there exists at least one path from u to v in G. Deleting
a cut-edge splits a component of G in two, whereas deleting a non-cut-edge has no effect
on components.


• Forests. A graph G is a forest (acyclic) iff G has no loops and for each $u, v \in V(G)$,
there is at most one path from u to v. A forest with n vertices and k edges has $n - k$
components.


• Trees. The following conditions on an n-vertex simple graph G are equivalent and
characterize trees: (a) G is connected with no cycles; (b) G is connected with at most $n - 1$
edges; (c) G is acyclic with at least $n - 1$ edges; (d) for all $u, v \in V(G)$, there exists a
unique path in G from u to v. An n-vertex tree has $n - 1$ edges and (for $n > 1$) at least
two leaves. Pruning any leaf from a tree gives another tree with one less vertex and one
less edge.
• Tree Enumeration Results. There are $n^{n-2}$ trees with vertex set $\{1, 2, \ldots, n\}$. There are
$n^{n-2}$ rooted trees on this vertex set rooted at 1. For $d_1 + \cdots + d_n = 2(n - 1)$, there
are $\binom{n-2}{d_1 - 1, \ldots, d_n - 1}$ trees on this vertex set with $\deg(j) = d_j$ for all j. Bijective proofs of
these facts use the following ideas:


— Functions on {1,2,...,n} fixing 1 and n correspond to rooted trees by arranging 
the cycles of the functional digraph in a certain order, breaking “back edges,” and 
linking the cycles to get a tree (see Figures 3.7 and 3.8). 


— Trees correspond to rooted trees by directing each edge of the tree towards the 
desired root vertex. 


— Trees with $\deg(j) = d_j$ correspond to words in $R(1^{d_1 - 1} \cdots n^{d_n - 1})$ by repeatedly
pruning the largest leaf and appending the leaf's neighbor to the end of the word.
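
The pruning map in the last item can be sketched as follows. This is an illustration under assumptions, not the construction from §3.12 verbatim: we assume pruning stops once two vertices remain (so the word has length $n - 2$), and the function names and the brute-force tree generator are hypothetical helpers.

```python
from itertools import combinations


def pruning_word(n, edges):
    """Prune the largest leaf n - 2 times, recording each pruned leaf's neighbor."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    word = []
    for _ in range(n - 2):
        leaf = max(v for v in adj if len(adj[v]) == 1)
        (neighbor,) = adj[leaf]
        word.append(neighbor)
        adj[neighbor].remove(leaf)
        del adj[leaf]
    return word


def labeled_trees(n):
    """Yield the edge lists of all trees on {1, ..., n} by brute force (small n only)."""
    all_edges = list(combinations(range(1, n + 1), 2))
    for edges in combinations(all_edges, n - 1):
        parent = list(range(n + 1))

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:          # this edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:               # n - 1 edges and no cycle: a tree
            yield list(edges)
```

For the tree with edges {1,2}, {1,3}, {3,4}, {3,5}, the leaves 5, 4, and 3 are pruned in turn, giving the word 3 3 1, in which each vertex j appears $\deg(j) - 1$ times. For small n one can also confirm that distinct trees receive distinct words and that there are $n^{n-2}$ of them.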




• Terms and Ordered Trees. For every term T, there exist a unique integer $n \ge 0$ and
unique terms $T_1, \ldots, T_n$ such that $T = n T_1 T_2 \cdots T_n$. A word $w_1 \cdots w_s$ is a term iff
$w_1 + \cdots + w_i - i \ge 0$ for all $i < s$ and $w_1 + \cdots + w_s - s = -1$. No proper prefix of a
term is a term. Terms correspond bijectively to ordered trees.


• Lists of Terms and Ordered Forests. Every list of terms has the form $T_1 \cdots T_n$ for some
unique integer $n \ge 0$ and unique terms $T_1, \ldots, T_n$. A word $w_1 \cdots w_s$ is a list of n terms
iff $w_1 + \cdots + w_i - i > -n$ for all $i < s$ and $w_1 + \cdots + w_s - s = -n$. Lists of terms
correspond bijectively to ordered forests.
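
The numerical criteria for terms and lists of terms translate directly into a prefix-sum test. The sketch below assumes words are given as Python lists of nonnegative integers; the function names are hypothetical.

```python
def is_list_of_terms(word, n):
    """Test whether word is a list of n terms using the prefix-sum criterion."""
    s = len(word)
    partial = 0
    for i, letter in enumerate(word, start=1):
        partial += letter
        # For i < s we need w_1 + ... + w_i - i > -n.
        if i < s and partial - i <= -n:
            return False
    # The full word must satisfy w_1 + ... + w_s - s = -n.
    return partial - s == -n


def is_term(word):
    """A term is exactly a list of one term."""
    return is_list_of_terms(word, 1)
```

For example, 2 0 0 and 1 0 are terms, 0 2 0 is not a term, and 0 0 is a list of two terms.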


• Cycle Lemma and Enumeration of Lists of Terms. For a word $w = w_1 w_2 \cdots w_s$ with
$w_1 + \cdots + w_s - s = -n$, there exist exactly n cyclic shifts of w that are lists of n terms.
Consequently, the number of lists of n terms using $k_i$ copies of i (for $0 \le i \le t$) is

$$\frac{n}{s} \binom{s}{k_0, k_1, \ldots, k_t},$$

where $s = \sum_{i=0}^{t} k_i$ and $k_0 = n + \sum_{i=1}^{t} (i - 1) k_i$.
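
Both statements of the cycle lemma can be tested by brute force on short words. In the sketch below, the function names are hypothetical, the letter counts $k_1, \ldots, k_t$ are passed as a list, and $k_0$ is computed from them as in the formula above.

```python
from itertools import permutations
from math import factorial


def is_list_of_terms(word, n):
    """Prefix-sum test for a list of n terms."""
    s, partial = len(word), 0
    for i, letter in enumerate(word, start=1):
        partial += letter
        if i < s and partial - i <= -n:
            return False
    return partial - s == -n


def good_shifts(word, n):
    """Return the indices i such that rotating word left by i gives a list of n terms."""
    return [i for i in range(len(word))
            if is_list_of_terms(word[i:] + word[:i], n)]


def count_by_formula(n, k_positive):
    """Evaluate (n/s) * multinomial(s; k_0, ..., k_t), where k_positive = [k_1, ..., k_t]."""
    k0 = n + sum((i - 1) * k for i, k in enumerate(k_positive, start=1))
    counts = [k0] + list(k_positive)
    s = sum(counts)
    multinomial = factorial(s)
    for k in counts:
        multinomial //= factorial(k)
    return (n * multinomial) // s


def count_by_brute_force(n, k_positive):
    """Count distinct rearrangements of the letters that are lists of n terms."""
    k0 = n + sum((i - 1) * k for i, k in enumerate(k_positive, start=1))
    letters = [0] * k0
    for i, k in enumerate(k_positive, start=1):
        letters += [i] * k
    return sum(1 for w in set(permutations(letters))
               if is_list_of_terms(list(w), n))
```

For the word 1 0 0 2 0 0 (where $n = 3$), exactly the shifts by 0, 2, and 3 positions are lists of three terms. With $n = 2$, $k_1 = 1$, and $k_2 = 1$ we get $k_0 = 3$ and $s = 5$, and both counts equal 8.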


• Chromatic Polynomials. For any edge e in a simple graph G, the chromatic function
of G satisfies the recursion $\chi_G = \chi_{G \sim \{e\}} - \chi_{G_e}$, where $G \sim \{e\}$ is G with e deleted,
and $G_e$ is G with e collapsed. It follows that $\chi_G(x)$ is a polynomial function of x. The
signed Stirling numbers of the first kind, $s(n, k)$, are the coefficients in the chromatic
polynomial for an n-vertex graph with an edge between each pair of vertices.
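
The deletion-contraction recursion can be used to evaluate $\chi_G(x)$ at an integer x and compared with a direct count of proper colorings. This is a sketch under assumptions: the graph is simple, edges are stored as two-element frozensets (so parallel edges created by collapsing an edge merge automatically, which does not affect proper colorings), and the base case $\chi_G(x) = x^{|V|}$ for an edgeless graph is supplied here.

```python
from itertools import product


def chromatic_value(vertices, edges, x):
    """Evaluate the chromatic function at x by deletion-contraction."""
    vertices = frozenset(vertices)
    edges = frozenset(frozenset(e) for e in edges)
    if not edges:
        return x ** len(vertices)          # every coloring is proper
    e = next(iter(edges))
    u, v = tuple(e)
    deleted = edges - {e}
    # Collapse e by renaming v to u in every remaining edge.
    contracted = frozenset(frozenset(u if w == v else w for w in f)
                           for f in deleted)
    return (chromatic_value(vertices, deleted, x)
            - chromatic_value(vertices - {v}, contracted, x))


def count_proper_colorings(vertices, edges, x):
    """Count proper colorings directly by trying all x^|V| color assignments."""
    vertices = list(vertices)
    total = 0
    for colors in product(range(x), repeat=len(vertices)):
        color = dict(zip(vertices, colors))
        if all(color[a] != color[b] for a, b in edges):
            total += 1
    return total
```

For the 4-cycle with edges {1,2}, {2,3}, {3,4}, {1,4} and three colors, both functions return 18.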


• Spanning Tree Recursion. For any edge e in a graph G, the number $\tau(G)$ of spanning
trees of G satisfies the recursion $\tau(G) = \tau(G \sim \{e\}) + \tau(G_e)$, where $G \sim \{e\}$ is G with
e deleted, and $G_e$ is G with e collapsed. A similar recursion holds for rooted spanning
trees of a digraph.
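
A direct implementation of this recursion for undirected graphs might look as follows. It is a sketch with some assumptions made explicit: edges are stored as a tuple of vertex pairs so that multiple edges survive collapsing, a loop edge is simply discarded (it lies in no spanning tree), and the base cases (one vertex gives 1, several vertices with no edges give 0) are supplied here.

```python
def count_spanning_trees(vertices, edges):
    """Count spanning trees of a graph (multiple edges allowed) by deletion-contraction."""
    vertices = frozenset(vertices)
    edges = tuple(edges)
    if len(vertices) == 1:
        return 1                    # the one-vertex tree
    if not edges:
        return 0                    # disconnected: no spanning tree
    (u, v), rest = edges[0], edges[1:]
    if u == v:
        return count_spanning_trees(vertices, rest)   # loops never lie in a tree
    deleted = count_spanning_trees(vertices, rest)
    # Collapse the edge by renaming v to u; this may create loops and multiple edges.
    merged = tuple((u if a == v else a, u if b == v else b) for a, b in rest)
    contracted = count_spanning_trees(vertices - {v}, merged)
    return deleted + contracted
```

For the complete graph on four vertices this returns 16, in agreement with the count $n^{n-2}$ of trees on an n-element vertex set. The running time is exponential in the number of edges, so this is only suitable for small examples.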


• Matrix-Tree Theorem. Given a digraph G and $v_0 \in V(G)$, let $L_{ii} = \mathrm{outdeg}_G(v_i)$, let
$-L_{ij}$ (for $i \neq j$) be the number of edges from $v_i$ to $v_j$ in G, and let $L_0$ be the matrix obtained from
$(L_{ij})$ by erasing the row and column indexed by $v_0$. Then $\det(L_0)$ is the number $\tau(G, v_0)$
of rooted spanning trees of G with root $v_0$.
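
The determinant in the matrix-tree theorem is easy to compute exactly with rational arithmetic. The following sketch assumes a loopless digraph on vertices $0, 1, \ldots, n-1$ given as a list of (tail, head) pairs; the function names are hypothetical, and the determinant routine is ordinary Gaussian elimination rather than anything specific to this text.

```python
from fractions import Fraction


def determinant(M):
    """Exact determinant of an integer matrix via Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det              # a row swap changes the sign
        det *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return int(det)


def rooted_spanning_tree_count(n, edges, root):
    """Return det(L_0) for a loopless digraph on vertices 0, ..., n-1."""
    L = [[0] * n for _ in range(n)]
    for i, j in edges:
        if i == j:
            raise ValueError("this sketch assumes a loopless digraph")
        L[i][i] += 1                # contributes to outdeg(v_i)
        L[i][j] -= 1                # -L[i][j] counts edges from v_i to v_j
    keep = [i for i in range(n) if i != root]
    L0 = [[L[i][j] for j in keep] for i in keep]
    return determinant(L0)
```

As a usage example, take the digraph on four vertices with one edge in each direction between every pair of vertices. Every spanning tree of the complete graph can be directed toward the root in exactly one way, so `rooted_spanning_tree_count` returns $4^2 = 16$.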


• Eulerian Tours. A digraph G has a closed Eulerian tour iff G is connected and balanced
(indegree equals outdegree at every vertex). In this case, the number of such tours
starting at $v_0$ is

$$\tau(G, v_0) \cdot \mathrm{outdeg}_G(v_0)! \cdot \prod_{v \neq v_0} (\mathrm{outdeg}_G(v) - 1)!.$$

The proof associates to each tour a rooted spanning tree built from the last departure
edge from each vertex, together with (truncated) departure words for each vertex giving
the order in which the tour used the other outgoing edges.
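
The tour-counting formula can be compared with an exhaustive search on a small balanced digraph. In this sketch, edges are stored as (tail, head, label) triples so that parallel edges stay distinguishable, rooted spanning trees are counted by brute force (one outgoing edge is chosen at each non-root vertex, and every vertex must reach the root), and the six-edge example digraph is a hypothetical one chosen for illustration.

```python
from itertools import product
from math import factorial


def count_rooted_spanning_trees(vertices, edges, root):
    """Brute force: choose one outgoing edge per non-root vertex, then test reachability."""
    others = [v for v in vertices if v != root]
    out = {v: [e for e in edges if e[0] == v] for v in others}
    count = 0
    for choice in product(*(out[v] for v in others)):
        parent = {e[0]: e[1] for e in choice}
        ok = True
        for v in others:
            seen = set()
            while v != root and v not in seen:
                seen.add(v)
                v = parent[v]
            if v != root:           # stuck in a cycle that avoids the root
                ok = False
                break
        count += ok
    return count


def tours_by_formula(vertices, edges, root):
    """Evaluate tau(G, v0) * outdeg(v0)! * prod over v != v0 of (outdeg(v) - 1)!."""
    outdeg = {v: sum(1 for e in edges if e[0] == v) for v in vertices}
    result = count_rooted_spanning_trees(vertices, edges, root) * factorial(outdeg[root])
    for v in vertices:
        if v != root:
            result *= factorial(outdeg[v] - 1)
    return result


def tours_by_search(vertices, edges, root):
    """Count closed walks from root that use every edge exactly once."""
    def extend(v, unused):
        if not unused:
            return 1 if v == root else 0
        return sum(extend(e[1], unused - {e}) for e in unused if e[0] == v)
    return extend(root, frozenset(edges))


# Hypothetical balanced digraph on {0, 1, 2} with six labeled edges.
V = [0, 1, 2]
E = [(0, 1, "a"), (1, 2, "b"), (2, 0, "c"), (0, 2, "d"), (2, 1, "e"), (1, 0, "f")]
```

In this example there are three rooted spanning trees with root 0, every outdegree is 2, and both `tours_by_formula(V, E, 0)` and `tours_by_search(V, E, 0)` give $3 \cdot 2! \cdot 1! \cdot 1! = 6$.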


Exercises 


3.124. Draw pictures of the following simple graphs, which have the indicated nicknames.
(a) the claw $C = (\{1, 2, 3, 4\}, \{\{1, 2\}, \{1, 3\}, \{1, 4\}\})$;
(b) the paw $P = (\{1, 2, 3, 4\}, \{\{1, 2\}, \{1, 3\}, \{1, 4\}, \{2, 3\}\})$;
(d) the bull $B = (\{1, 2, 3, 4, 5\}, \{\{1, 2\}, \{2, 3\}, \{1, 3\}, \{1, 4\}, \{2, 5\}\})$;
(e) the n-path $P_n = (\{1, 2, \ldots, n\}, \{\{i, i+1\} : 1 \le i < n\})$;
(f) the n-cycle $C_n = (\{1, 2, \ldots, n\}, \{\{i, i+1\} : 1 \le i < n\} \cup \{\{1, n\}\})$, where $n \ge 3$;
(g) the complete graph $K_n = (\{1, 2, \ldots, n\}, \{\{i, j\} : 1 \le i < j \le n\})$.

3.125. Let V be an n-element set. (a) How many simple graphs have vertex set V? (b) 
How many simple digraphs have vertex set V? 


3.126. Let V and E be sets with $|V| = n$ and $|E| = m$. (a) How many digraphs have vertex
set V and edge set E? (b) How many graphs have vertex set V and edge set E?


3.127. Let V be an n-element set. Define a bijection between the set of simple graphs with 
vertex set V and the set of symmetric, irreflexive binary relations on V. Conclude that 
simple graphs can be viewed as certain kinds of simple digraphs. 


3.128. Let G, H, and K be graphs (resp. digraphs). (a) Prove $G \cong G$. (b) Prove $G \cong H$
implies $H \cong G$. (c) Prove $G \cong H$ and $H \cong K$ imply $G \cong K$. Thus, graph isomorphism is
an equivalence relation on any given set of graphs (resp. digraphs).


3.129. Find all isomorphism classes of simple graphs with at most four vertices. 
3.130. Find the adjacency matrices for the graphs in 3.124. 


3.131. Let G be the simple graph in Figure 3.10. For 1 < k < 8, find the number of walks 
in G from vertex 1 to vertex 10. 


3.132. Let G be the graph in Figure 3.22. Find the number of walks in G of length 5 
between each pair of vertices. 


3.133. Let G be the digraph in Figure 3.25. Find the number of closed walks in G of length 
10 that begin at vertex 0. 


3.134. (a) Show that a graph G with a closed walk of odd length must have a cycle of odd 
length. (b) If G has a closed walk of even length, must G have a cycle? 


3.135. Let G be a graph with adjacency matrix A. (a) Find a formula for the number of
paths in G of length 2 from $v_i$ to $v_j$. (b) Find a formula for the number of paths in G of
length 3 from $v_i$ to $v_j$.

3.136. Consider the DAG G shown here.

[Figure: the DAG G for Exercise 3.136.]


(a) Find all total orderings of the vertices for which the adjacency matrix of G is strictly 
lower-triangular. (b) How many paths in G go from vertex 5 to vertex 1? 


3.137. An irreflexive, transitive binary relation on a set X is called a strict partial order
on X. Given a strict partial order R on a finite set X, show that the simple digraph (X, R)
is a DAG.

3.138. For each of the following sets X and strict partial orders R, draw the associated
DAG (see 3.137) and calculate the number of paths from the smallest element to the largest
element of the partially ordered set.

(a) $X = \{1, 2, 3, 4, 5\}$ under the ordering $1 < 2 < 3 < 4 < 5$.

(b) $X = P(\{1, 2, 3\})$, and $(S, T) \in R$ iff $S \subsetneq T$.

(c) X is the set of positive divisors of 60, and $(a, b) \in R$ iff $a < b$ and a divides b.


3.139. Let X = {1,2,...,n} ordered by 1 < 2 <--- <n. In the associated DAG (see 3.137), 
how many paths go from 1 to n? Can you find a combinatorial (not algebraic) proof of your 
answer? 


3.140. Let X be the set of subsets of $\{1, 2, \ldots, n\}$ ordered by (strict) set inclusion. In the
associated DAG (see 3.137), how many paths go from $\emptyset$ to $\{1, 2, \ldots, n\}$?


3.141. Given a digraph G, construct a simple digraph H as follows. The vertices of H are
the strong components of G. Given $C, D \in V(H)$ with $C \neq D$, there is an edge from C to
D in H iff there exist $c \in C$ and $d \in D$ such that there is an edge from c to d in G. (a)
Prove that H is a DAG. (b) Conclude that some strong component C of G has no incoming
edges from outside C, and some strong component D has no outgoing edges. (c) Draw the
DAGs associated to the digraph $G_3$ in Figure 3.1 and the functional digraph in Figure 3.5.


3.142. (a) Find the degree sequence for the graph in Figure 3.10, and verify 3.34 in this 
case. (b) Compute the indegrees and outdegrees at each vertex of the digraph in Figure 3.25, 
and verify 3.31 in this case. 


3.143. Find necessary and sufficient conditions for a multiset [d1,d2,...,dn] to be the 
degree sequence of a graph G. 


3.144. Consider the cycle graph $C_n$ defined in 3.124. (a) What is $\deg(C_n)$? (b) Show that
any connected graph with the degree sequence in (a) must be isomorphic to $C_n$. (c) How
many graphs with vertex set $\{1, 2, \ldots, n\}$ are isomorphic to $C_n$? (d) How many isomorphism
classes of graphs have the same degree sequence as $C_n$? (e) How many isomorphism classes
of simple graphs have the same degree sequence as $C_n$?


3.145. Consider the path graph $P_n$ defined in 3.124. (a) What is $\deg(P_n)$? (b) Show that
any connected graph with the degree sequence in (a) must be isomorphic to $P_n$. (c) How
many graphs with vertex set $\{1, 2, \ldots, n\}$ are isomorphic to $P_n$? (d) How many isomorphism
classes of graphs have the same degree sequence as $P_n$?


3.146. Find two simple graphs G and H with the smallest possible number of vertices,
such that $\deg(G) = \deg(H)$ but $G \not\cong H$.


3.147. Prove or disprove: there exists a simple graph G with more than one vertex such 
that the degree sequence deg(G) contains no repetitions. 


3.148. Prove or disprove: there exists a graph G with no loops and more than one vertex 
such that the degree sequence deg(G) contains no repetitions. 


3.149. Given a graph $G = (V, E, \epsilon)$, we can encode the endpoint function $\epsilon$ by a $|V| \times |E|$
matrix M, with rows indexed by V and columns indexed by E, such that $M(v, e)$ is 2 if
e is a loop edge at v, 1 if e is a non-loop edge incident to v, and 0 otherwise. M is called
the incidence matrix of G. Prove the degree-sum formula 3.34 by computing the sum of all
entries of M in two ways.

3.150. Draw the functional digraphs associated to each of the following functions $f :
X \to X$. For each digraph, find the set C of cyclic vertices and the set partition $\{S_v :
v \in C\}$ described in 3.43. (a) $X = \{1, 2, 3, 4\}$, f is the identity map on X; (b) $X =
\{0, 1, \ldots, 6\}$, $f(x) = (x^2 + 1) \bmod 7$; (c) $X = \{0, 1, \ldots, 12\}$, $f(x) = (x^2 + 1) \bmod 13$; (d)
$X = \{0, 1, \ldots, 10\}$, $f(x) = 3x \bmod 11$; (e) $X = \{0, 1, \ldots, 11\}$, $f(x) = 4x \bmod 12$.


3.151. Let $X = \{0, 1, 2, \ldots, 9\}$. (a) Define $f : X \to X$ by setting $f(x) = (3x + 7) \bmod 10$.
Draw the functional digraphs for $f$, $f^{-1}$, and $f \circ f$. What is the smallest integer $k > 0$ such
that $f \circ f \circ \cdots \circ f$ (k factors) is the identity map on X? (b) Define $g : X \to X$ by setting
$g(x) = (2x + 3) \bmod 10$. Draw the functional digraphs for $g$ and $g \circ g$.


3.152. Let X be a finite set, let $x_0 \in X$, and let $f : X \to X$ be any function. Recursively
define $x_{m+1} = f(x_m)$ for all $m \ge 0$. Show that there exists $i > 0$ with $x_i = x_{2i}$.


3.153. Pollard-rho Factoring Algorithm. Suppose $N > 1$ is an integer. Let $X =
\{0, 1, \ldots, N - 1\}$, and define $f : X \to X$ by $f(x) = (x^2 + 1) \bmod N$. (a) Show that the
following algorithm always terminates and returns a divisor of N greater than 1. (Use 3.152.)


Step 1. Set u = f(0), v = f(f(0)), and d = gcd(v — u, N). 
Step 2. While d= 1: set u = f(u), v= f(f(v)), and d= gcd(v — u, N). 
Step 3. Return d. 


(b) Trace the steps taken by this algorithm to factor N = 77 and N = 527. 
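
For readers who wish to experiment, Steps 1 through 3 can be transcribed almost line for line. The sketch below is one possible transcription, with a hypothetical function name; it relies on the standard-library `math.gcd`, which returns a nonnegative result even when $v - u$ is negative, and it does not replace the termination argument requested in part (a).

```python
from math import gcd


def pollard_rho(N):
    """Run Steps 1-3 with f(x) = (x^2 + 1) mod N and return the resulting divisor d."""
    def f(x):
        return (x * x + 1) % N

    # Step 1.
    u = f(0)
    v = f(f(0))
    d = gcd(v - u, N)
    # Step 2: u advances one step per round, v advances two steps.
    while d == 1:
        u = f(u)
        v = f(f(v))
        d = gcd(v - u, N)
    # Step 3.
    return d
```

Note that the returned divisor may be N itself (for example when N is prime), so a return value of N gives no information about whether N is composite.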


3.154. Suppose X is a finite set of size k and $f : X \to X$ is a random function (so for all
$x, y \in X$, $P(f(x) = y) = 1/k$, and these events are independent for different choices of x).
Let $x_0 \in X$, define $x_{m+1} = f(x_m)$ for all $m \ge 0$, and let S be the least index such that
$x_S = x_t$ for some $t < S$. (a) For each $s > 0$, find the exact probability that $S > s$. (b) Argue
informally that the expected value of S is at most $2\sqrt{k}$. (c) Use (b) to argue informally that
the expected number of gcd computations needed by the Pollard-rho factoring algorithm to
find a divisor of a composite number N (see 3.153) is bounded above by $2N^{1/4}$.


3.155. Let V be an n-element set, and let $v_0 \notin V$. A function $f : V \to V$ is called acyclic iff
all cycles in the functional digraph of f have length 1. Count these functions by setting up
a bijection between the set of acyclic functions on V and the set of rooted trees on $V \cup \{v_0\}$
with root $v_0$.


3.156. How many bijections f on an 8-element set are such that the functional digraph of 
f has (a) five cycles; (b) three cycles; (c) one cycle? 


3.157. Let X be an n-element set. Let Y be the set of all functional digraphs for bijections 
f :X — X. How many equivalence classes does Y have under the equivalence relation of 
graph isomorphism (see 3.128)? 


3.158. How many functional digraphs with vertex set $\{1, 2, \ldots, n\}$ have $a_1$ cycles of length
1, $a_2$ cycles of length 2, etc., where $\sum_i i a_i = n$?


3.159. Referring to the proof of 3.47, draw pictures of the set A of functions, the set B of
trees, and the bijection $\phi : A \to B$ when $n = 4$.


3.160. Compute the rooted tree associated to the function below by the map $\phi$ in the proof
of 3.47.

3.161. Compute the function associated to the rooted tree with edge set
{(1, 1), (2, 12), (3, 1), (4, 3), (5, 10), (6, 17), (7, 15), (8, 7), (9, 3),
(10, 3), (11, 12), (12, 1), (13, 4), (14, 10), (15, 1), (16, 4), (17, 4)}
by the map $\phi^{-1}$ in the proof of 3.47.


3.162. Formulate a theorem for rooted trees similar to 3.75, and prove it by analyzing the 
bijection in 3.47. 


3.163. Let G be the digraph in Figure 3.2. Use the algorithm in 3.52 to convert the walk 
W = (1,0,1,0,1,4,3, f,5,m,2,n,5,h,4,¢,3, f,5, 9,4, 9,5, m, 2, &, 4) 

to a path in G from 1 to 4. 

3.164. What are the strong components of a functional digraph? 


3.165. Show that a connected graph G with n vertices has n edges iff G has exactly one 
cycle. 


3.166. Prove that a graph G is not connected iff there exists an ordering of the vertices of 
G for which the adjacency matrix of G is block-diagonal with at least two diagonal blocks. 


3.167. Prove 3.60 using 3.58, and again without using 3.58. 
3.168. How many connected simple graphs have vertex set {1, 2,3, 4}? 


3.169. How many connected simple graphs on the vertex set {1, 2,3, 4,5} have exactly five 
edges? 


3.170. Bipartite Graphs. A graph G is called bipartite iff there exist two sets A and B
(called partite sets for G) such that $A \cap B = \emptyset$, $A \cup B = V(G)$, and every edge of G has one
endpoint in A and one endpoint in B. (a) Prove that a bipartite graph G has no cycle of
odd length. (b) Prove that a graph G with no odd-length cycles is bipartite by considering,
for each component C of G, the length of the shortest path from a fixed vertex $v_0 \in C$ to
the other vertices in C (cf. 3.134). (c) Prove that a graph G with no odd-length cycles is
bipartite by induction on the number of edges in G.


3.171. How many bipartite simple graphs have partite sets A = {1,2,...,m} and B = 
{m+1,...,m+n}? 

3.172. Suppose G is a k-regular graph with n vertices. (a) How many edges are in G? (b)
If $k > 0$ and G is bipartite with partite sets A and B, prove that $|A| = |B|$.

3.173. Fix k > 2. Prove or disprove: there exists a k-regular bipartite graph G such that 
G has a cut-edge. 


3.174. Prove that an n-vertex graph G in which every vertex has degree at least (n — 1)/2 
must be connected. 


3.175. Let G be a forest with n vertices and k connected components. Compute
$\sum_{v \in V(G)} \deg_G(v)$ in terms of n and k.


3.176. The arboricity of a simple graph G, denoted arb(G), is the least n such that there
exist n forests $F_i$ with $V(G) = \bigcup_{i=1}^{n} V(F_i)$ and $E(G) = \bigcup_{i=1}^{n} E(F_i)$. Prove that

$$\mathrm{arb}(G) \ge \max_{H} \left\lceil \frac{|E(H)|}{|V(H)| - 1} \right\rceil,$$

where H ranges over all induced subgraphs of G with more than one vertex. (It can be
shown that equality holds [99].)

3.177. Show that any tree not isomorphic to a path graph $P_n$ (see 3.124(e)) must have at
least three leaves.


3.178. Let T be a tree. Show that $\deg_T(v)$ is odd for all $v \in V(T)$ iff for all $e \in E(T)$, both
connected components of $(V(T), E(T) \sim \{e\})$ have an odd number of vertices.


3.179. Helly Property of Trees. Suppose $T, T_1, \ldots, T_k$ are trees, each $T_i$ is a subgraph
of T, and $V(T_i) \cap V(T_j) \neq \emptyset$ for all $i, j \le k$. Show that $\bigcap_{i=1}^{k} V(T_i) \neq \emptyset$.


3.180. Let G be a tree with leaves $\{v_1, \ldots, v_m\}$. Let H be a tree with leaves $\{w_1, \ldots, w_m\}$.
Suppose that, for each i and j, the length of the unique path in G from $v_i$ to $v_j$ equals the
length of the unique path in H from $w_i$ to $w_j$. Prove $G \cong H$.


3.181. For 1 <n < 7, count the number of isomorphism classes of trees with n vertices. 


3.182. (a) How many isomorphism classes of n-vertex trees have exactly 3 leaves? (b) How 
many trees with vertex set {1,2,...,n} have exactly 3 leaves? 


3.183. How many trees with vertex set {1,2,...,n} have exactly k leaves? 


3.184. Let $K_n$ be the complete graph on n vertices (see 3.124). (a) Give a bijective or
probabilistic proof that every edge of $K_n$ appears in the same number of spanning trees of
$K_n$. (b) Use Cayley's theorem to count the spanning trees of $K_n$ that do not use the edge
{1, 2}.


3.185. Use 3.75 to find the number of trees T with $V(T) = \{1, 2, \ldots, 8\}$ and $\deg(T) =
[3, 3, 3, 1, 1, 1, 1, 1]$.


3.186. Let $t_n$ be the number of trees on a given n-element vertex set. Without using
Cayley's theorem, prove the recursion

$$t_n = \sum_{k=1}^{n-1} k \binom{n-2}{k-1} t_k t_{n-k}.$$

3.187. (a) Use the pruning bijection to find the word associated to the tree

$$T = (\{0, 1, \ldots, 8\}, \{\{1, 5\}, \{2, 8\}, \{3, 7\}, \{7, 0\}, \{6, 2\}, \{4, 7\}, \{5, 4\}, \{2, 4\}\}).$$


(b) Use the inverse of the pruning bijection to find the tree with vertex set {0,1,...,8} 
associated to the word 1355173. 


3.188. Use the inverse of the pruning bijection to find all trees with vertex set {1,2,...,7} 
associated to the words in R(11334). 


3.189. Let G be the graph with vertex set $\{\pm 1, \pm 2, \ldots, \pm n\}$ and with an edge between i
and $-j$ for all $i, j \in \{1, 2, \ldots, n\}$. (a) Show that any spanning tree in G has at least one
positive leaf and at least one negative leaf. (b) Develop an analogue of the pruning map
that sets up a bijection between the set of spanning trees of G and pairs of words $(u, v)$,
where $u \in \{1, \ldots, n\}^{n-1}$ and $v \in \{-1, \ldots, -n\}^{n-1}$. Conclude that G has $n^{2n-2}$ spanning
trees.


3.190. (a) How many words in R(09112'374') are terms? (b) How many words in 
R(0°11233') are lists of n terms? What is n? 


3.191. Given w = 00220000201030, use the proof of the cycle lemma 3.90 to find all i such
that the cyclic rotation $R_i(w)$ is a list of 4 terms.

3.192. Consider a product $x_1 \times x_2 \times \cdots \times x_n$ where the binary operation $\times$ is not
necessarily associative. Define a bijection from the set of complete parenthesizations of this
product to the set of terms in $R(0^n 2^{n-1})$. Then use 3.91 to show that the number of such
parenthesizations is given by a Catalan number.


3.193. Let $\chi_n(x)$ be the chromatic polynomial for the graph $C_n$ consisting of n vertices
joined in a cycle. Prove that

$$\chi_n(x) = (x - 1)^n + (-1)^n (x - 1) \qquad (n \ge 2).$$


3.194. Find the chromatic polynomials for the graphs in 3.124(a),(b),(c),(d). 
3.195. Find the chromatic polynomial and chromatic number for the graph G2 in Figure 3.1. 
3.196. Find two non-isomorphic simple graphs with the same chromatic polynomial. 


3.197. A certain department wishes to schedule meetings for a number of committees, 
whose members are listed in the following table. 


Committee     | Members
Advisory      | Driscoll, Loomis, Lasker
Alumni        | Sheffield, Loomis
Colloquium    | Johnston, Tchaikovsky, Zorn
Computer      | Loomis, Clark, Spade
Graduate      | Kennedy, Loomis, Trotter
Merit         | Lee, Rotman, Fowler, Sheffield
Personnel     | Lasker, Schreier, Tchaikovsky, Trotter
Undergraduate | Jensen, Lasker, Schreier, Trotter, Perkins


(a) What is the minimum number of time slots needed so that all committees could meet 
with no time conflicts? (b) How many non-conflicting schedules are possible if there are six 
(distinguishable) time slots available? (c) Repeat (a) and (b), assuming that Zorn becomes 
a member of the merit committee (and remains a member of the colloquium committee). 


3.198. Let K,, be the complete graph on n vertices (see 3.124). (a) How many subgraphs 
does K,, have? (b) How many induced subgraphs does K,, have? 


3.199. Prove that a graph G has at least one spanning tree iff G is connected. 
3.200. Fill in the details of the proof of 3.111. 
3.201. Use the spanning tree recursion 3.108 to find r(G,) for the graph G, in Figure 3.1. 


3.202. Let $T_1$ and $T_2$ be spanning trees of a graph G.
(a) If $e_1 \in E(T_1) \sim E(T_2)$, prove there exists $e_2 \in E(T_2) \sim E(T_1)$ such that

$$T_3 = (V(G), (E(T_1) \sim \{e_1\}) \cup \{e_2\})$$

is a spanning tree of G.
(b) If $e_1 \in E(T_1) \sim E(T_2)$, prove there exists $e_2 \in E(T_2) \sim E(T_1)$ such that

$$T_3 = (V(G), (E(T_2) \cup \{e_1\}) \sim \{e_2\})$$

is a spanning tree of G.


3.203. Fix $k \ge 3$. For each $n \ge 1$, let $G_n$ be a graph obtained by gluing together n regular
k-gons in a row along shared edges. The picture below illustrates the case k = 6, n = 5.

[Figure: five regular hexagons glued in a row along shared edges.]

Let $G_0$ consist of a single edge. Prove the recursion

$$\tau(G_n) = k\,\tau(G_{n-1}) - \tau(G_{n-2}) \qquad (n \ge 2).$$

What are the initial conditions?

3.204. Given a simple graph G, let $G \sim v$ be the induced subgraph with vertex set $V(G) \sim
\{v\}$. Assume $|V(G)| = n \ge 3$. (a) Prove that $|E(G)| = (n - 2)^{-1} \sum_{v \in V(G)} |E(G \sim v)|$. (b)
Prove that, for $v_0 \in V(G)$, $\deg_G(v_0) = (n - 2)^{-1} \sum_{v \in V(G)} |E(G \sim v)| - |E(G \sim v_0)|$.


3.205. For each graph in 3.124(a) through (f), count the number of spanning trees by direct 
enumeration, and again by the matrix-tree theorem. 


3.206. Confirm by direct enumeration that the digraph in Figure 3.25 has 16 spanning 
trees rooted at 0. 


3.207. Let G be the graph with vertex set $\{0, 1\}^3$ such that there is an edge between
$v, w \in V(G)$ iff the words v and w differ in exactly one position. Find the number of
spanning trees of G.


3.208. Let I be the $m \times m$ identity matrix, let J be the $m \times m$ matrix all of whose entries
are 1, and let t, u be scalars. Show that $\det(tI - uJ) = t^m - m t^{m-1} u$.


3.209. Deduce Cayley’s theorem 3.72 from the matrix-tree theorem 3.114. 


3.210. Let A and B be disjoint sets of size m and n, respectively. Let G be the simple graph
with vertex set $A \cup B$ and edge set $\{\{a, b\} : a \in A, b \in B\}$. Show that $\tau(G) = m^{n-1} n^{m-1}$.


3.211. How many closed Eulerian tours starting at vertex 5 does the digraph in Figure 3.26 
have? 


3.212. An Eulerian tour of a graph G is a walk in G that uses every edge exactly once and 
visits every vertex. (a) Find necessary and sufficient conditions for a graph to have a closed 
Eulerian tour. (b) Find necessary and sufficient conditions for a graph to have an Eulerian 
tour. 


3.213. Consider a “digraph with indistinguishable edges” consisting of a vertex set V and 
a multiset of directed edges (u,v) € V x V. Formulate the notion of a closed Eulerian tour 
for such a digraph, and prove an analogue of 3.122. 


3.214. de Bruijn Sequences. Let $A = \{x_1, \ldots, x_n\}$ be an n-letter alphabet. For each
$k \ge 2$, show that there exists a word $w = w_0 w_1 \cdots w_{n^k - 1}$ such that the $n^k$ words

$$w_i w_{i+1} \cdots w_{i+k-1} \qquad (0 \le i < n^k)$$

(where subscripts are reduced mod $n^k$) consist of all possible k-letter words over A.


3.215. The Petersen graph is the graph G with vertex set consisting of all two-element
subsets of {1, 2, 3, 4, 5}, and with edge set $\{\{A, B\} : A \cap B = \emptyset\}$. (a) Compute the number
of vertices and edges in G. (b) Show that G is isomorphic to each of the graphs shown here.
(c) Show that G is 3-regular. (d) Is G bipartite? (e) Show that any two non-adjacent vertices
in G have exactly one common neighbor.


3.216. Find (with proof) all k such that the Petersen graph has a cycle of length k. 


3.217. Given any edge e in the Petersen graph G, count the number of cycles of length 5 
in G that contain e. Use this to count the total number of cycles of length 5 in G. 


3.218. (a) Prove that the Petersen graph G has exactly ten cycles of length 6. (b) How 
many claws (see 3.124) appear as induced subgraphs of G? 


3.219. How many spanning trees does the Petersen graph have? 




Notes 


Our coverage of graph theory in this chapter has been limited to a few enumerative topics. 
Systematic expositions of graph theory may be found in [14, 17, 18, 27, 59, 67, 136, 143]; the 
text by West is especially recommended. Roberts [114] gives a treatment of graph theory 
that emphasizes applications. 

The bijection used to enumerate rooted trees in 3.47 is due to Egecioglu and Remmel [32]. 
The original proof of Cayley’s formula 3.75 appears in Cayley [24]. The pruning bijection 
described in §3.12 is due to Prüfer [105]; the image of a tree under this map is often called
the Prüfer code of the tree. For more on the enumeration of trees, see Moon [96].

A version of the cycle lemma 3.90 occurs in the work of Dvoretsky and Motzkin [31]. 
This lemma and other equivalent results have been independently rediscovered (in various 
guises) by many authors. Our discussion of the enumeration of lists of terms in §3.14 closely 
follows Raney’s classic paper on Lagrange inversion [107]. 

The matrix-tree theorem for undirected graphs is usually attributed to Kirchhoff [76]; 
Tutte extended the theorem to digraphs [132]. The enumeration of Eulerian tours in 3.122 
was proved by van Aardenne-Ehrenfest and de Bruijn [133].


Inclusion-Exclusion and Related Techniques


This chapter studies combinatorial techniques that are related to the arithmetic operation
of subtraction: involutions, inclusion-exclusion formulas, and Möbius inversion. Involutions
allow us to give bijective proofs of identities involving both positive and negative terms.
The inclusion-exclusion formula extends the sum rule 1.2 to a rule for computing
$|A_1 \cup A_2 \cup \cdots \cup A_n|$ in the case where the sets $A_i$ need not be pairwise disjoint. This formula turns
out to be a special case of the general Möbius inversion formula for posets, which has many
applications in number theory and algebra as well as combinatorics.


4.1 Involutions 


In Chapter 2, we saw how to use bijections to prove combinatorial identities. Many iden- 
tities involve a mixture of positive and negative terms. One can use involutions to furnish 
combinatorial proofs of such identities. We illustrate the idea using the following binomial 
coefficient identity. 


4.1. Theorem. For all $n \geq 1$, $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0$.


Proof. The result can be proved algebraically by using the binomial theorem 2.14 to expand
the left side of $(-1+1)^n = 0$. To prove the identity combinatorially, let $X$ be the set of all
subsets of $\{1,2,\ldots,n\}$. For each $S \in X$, we define the sign of $S$ to be $\mathrm{sgn}(S) = (-1)^{|S|}$.
Since there are $\binom{n}{k}$ subsets $S$ of size $k$, and $\mathrm{sgn}(S) = (-1)^k$ for all such subsets, we see that

\[ \sum_{S \in X} \mathrm{sgn}(S) = \sum_{k=0}^{n} (-1)^k \binom{n}{k}. \]

Thus we have found a combinatorial model for the left side of the desired identity, which
involves signed objects.

Now, define a function $I : X \to X$ as follows. Given $S \in X$, let $I(S) = S \cup \{1\}$ if
$1 \notin S$, and let $I(S) = S \sim \{1\}$ if $1 \in S$. Observe that $I(I(S)) = S$ for all $S \in X$; in other
words, $I \circ I = \mathrm{id}_X$. Thus, $I$ is a bijection that is equal to its own inverse. Furthermore, since
$|I(S)| = |S| \pm 1$, $\mathrm{sgn}(I(S)) = -\mathrm{sgn}(S)$ for all $S \in X$. It follows that $I$ pairs each positive
object in $X$ with a negative object in $X$. Consequently, the number of positive objects in
$X$ equals the number of negative objects in $X$, and so $\sum_{S \in X} \mathrm{sgn}(S) = 0$. □


The general setup for involution proofs is described as follows. 


4.2. Definition: Involutions. An involution on a set $X$ is a function $I : X \to X$ such
that $I \circ I = \mathrm{id}_X$. Equivalently, $I$ is a bijection on $X$ and $I = I^{-1}$. Given an involution $I$,
the fixed point set of $I$ is the set $\mathrm{Fix}(I) = \{x \in X : I(x) = x\}$, which may be empty. If
$\mathrm{sgn} : X \to \{+1,-1\}$ is a function that attaches a sign to every object in $X$, we say that $I$ is
a sign-reversing involution (relative to sgn) iff for all $x \in X \sim \mathrm{Fix}(I)$, $\mathrm{sgn}(I(x)) = -\mathrm{sgn}(x)$.


4.3. Involution Theorem. Given a finite set $X$ of signed objects and a sign-reversing
involution $I$ on $X$,

\[ \sum_{x \in X} \mathrm{sgn}(x) = \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x). \]


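Proof. Let $X^+ = \{x \in X \sim \mathrm{Fix}(I) : \mathrm{sgn}(x) = +1\}$ and $X^- = \{x \in X \sim \mathrm{Fix}(I) :
\mathrm{sgn}(x) = -1\}$. By definition, $I$ restricts to $X^+$ and $X^-$ to give functions $I^+ : X^+ \to X^-$
and $I^- : X^- \to X^+$ that are mutually inverse bijections. Therefore, $|X^+| = |X^-|$ and

\[ \sum_{x \in X} \mathrm{sgn}(x) = \sum_{x \in X^+} \mathrm{sgn}(x) + \sum_{x \in X^-} \mathrm{sgn}(x) + \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x)
   = |X^+| - |X^-| + \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x) = \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x). \]  □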


As a first illustration of the involution theorem, we prove a variation of 4.1. 


4.4. Theorem. For all $n \geq 1$,

\[ \sum_{k=0}^{n} (-1)^k \binom{2n}{k} = (-1)^n \binom{2n-1}{n}. \]


Proof. Let $X$ be the set of all subsets of $\{1,2,\ldots,2n\}$ of size at most $n$, and let the sign of
a subset $T$ be $(-1)^{|T|}$. The left side of the desired identity is $\sum_{T \in X} \mathrm{sgn}(T)$. Next, define an
involution $I$ on $X$ as follows. If $T \in X$ and $1 \in T$, let $I(T) = T \sim \{1\}$. If $T \in X$ and $1 \notin T$
and $|T| < n$, let $I(T) = T \cup \{1\}$. Finally, if $T \in X$ and $1 \notin T$ and $|T| = n$, let $I(T) = T$.
One checks immediately that $I$ is a sign-reversing involution. The fixed points of $I$ are the
$n$-element subsets of $\{1,2,\ldots,2n\}$ not containing 1. There are $\binom{2n-1}{n}$ such subsets, and each
of them has sign $(-1)^n$. So $\sum_{T \in \mathrm{Fix}(I)} \mathrm{sgn}(T)$ is the right side of the desired identity. □
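The involution argument for 4.4 can be checked by brute force for small $n$. The following Python sketch is only an illustration (the function name `check_theorem_4_4` is hypothetical, not from the text): it builds the set $X$, applies the map $I$ defined in the proof, and compares the full signed sum, the signed sum over fixed points, and the closed form.

```python
from itertools import combinations
from math import comb

def check_theorem_4_4(n):
    """Return (signed sum over X, signed sum over Fix(I), closed form) for Theorem 4.4."""
    ground = range(1, 2 * n + 1)
    # X = all subsets of {1,...,2n} of size at most n
    X = [frozenset(c) for k in range(n + 1) for c in combinations(ground, k)]
    members = set(X)

    def sgn(T):
        return (-1) ** len(T)

    def I(T):
        if 1 in T:
            return T - {1}
        if len(T) < n:
            return T | {1}
        return T  # fixed point: |T| = n and 1 is not in T

    # I maps X to X, is an involution, and reverses sign off its fixed points
    assert all(I(T) in members and I(I(T)) == T for T in X)
    assert all(sgn(I(T)) == -sgn(T) for T in X if I(T) != T)

    lhs = sum(sgn(T) for T in X)
    rhs = sum(sgn(T) for T in X if I(T) == T)
    return lhs, rhs, (-1) ** n * comb(2 * n - 1, n)
```

All three returned values should agree for every $n \geq 1$ that is small enough to enumerate.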


4.5. Theorem. For all $n \geq 0$,

\[ \sum_{k=0}^{n} (-1)^k \binom{n}{k}^2 =
   \begin{cases} 0 & \text{if $n$ is odd;} \\ (-1)^{n/2} \binom{n}{n/2} & \text{if $n$ is even.} \end{cases} \]


Proof. Let $X$ be the set of all pairs $(S,T)$, where $S$ and $T$ are subsets of $\{1,2,\ldots,n\}$
of the same size. Define $\mathrm{sgn}(S,T) = (-1)^{|S|}$. Then the left side of the desired identity is
$\sum_{(S,T) \in X} \mathrm{sgn}(S,T)$. We define an involution $I$ on $X$ as follows. Given $(S,T) \in X$, let $i$
be the least integer in $\{1,2,\ldots,n\}$ (if there is one) such that either $i \notin S$ and $i \notin T$, or
$i \in S$ and $i \in T$. In the former case, let $I(S,T) = (S \cup \{i\}, T \cup \{i\})$; in the latter case, let
$I(S,T) = (S \sim \{i\}, T \sim \{i\})$; if no such $i$ exists, let $I(S,T) = (S,T)$. It is routine to check
that $I$ is a sign-reversing involution; in particular, the designated integer $i$ in the definition
of $I(S,T)$ is the same as the $i$ used to calculate $I(I(S,T))$. By the involution theorem,

\[ \sum_{k=0}^{n} (-1)^k \binom{n}{k}^2 = \sum_{(S,T) \in \mathrm{Fix}(I)} \mathrm{sgn}(S,T). \]

Note that $(S,T) \in \mathrm{Fix}(I)$ iff for every $i \leq n$, $i$ lies in exactly one of the two sets $S$ or $T$.
This can only happen if $n$ is even and $|S| = |T| = n/2$ and $S = \{1,2,\ldots,n\} \sim T$. So the
fixed point set is empty if $n$ is odd. If $n$ is even, we can construct an arbitrary element of
$\mathrm{Fix}(I)$ by choosing any subset $S$ of size $n/2$ and letting $T$ be the complementary subset of
$\{1,2,\ldots,n\}$. Since there are $\binom{n}{n/2}$ choices for $S$, each with sign $(-1)^{n/2}$, the formula in the
theorem is proved. □
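As a sanity check on the proof of 4.5, the involution on pairs $(S,T)$ can be enumerated directly for small $n$. This Python sketch (the name `signed_fixed_point_sum` is an illustrative choice) implements the map $I$ exactly as described above and returns the signed sum over its fixed points, which should match both the alternating sum of squared binomial coefficients and the closed form.

```python
from itertools import combinations
from math import comb

def signed_fixed_point_sum(n):
    """Sum of sgn(S,T) over the fixed points of the involution in the proof of 4.5."""
    ground = list(range(1, n + 1))
    subsets = [frozenset(c) for k in range(n + 1) for c in combinations(ground, k)]
    X = [(S, T) for S in subsets for T in subsets if len(S) == len(T)]

    def I(pair):
        S, T = pair
        for i in ground:  # least i lying in both sets or in neither set
            if i not in S and i not in T:
                return (S | {i}, T | {i})
            if i in S and i in T:
                return (S - {i}, T - {i})
        return pair  # every i lies in exactly one of S, T

    assert all(I(I(p)) == p for p in X)  # I is an involution
    return sum((-1) ** len(S) for (S, T) in X if I((S, T)) == (S, T))

def alternating_square_sum(n):
    """Left side of the identity in 4.5."""
    return sum((-1) ** k * comb(n, k) ** 2 for k in range(n + 1))
```

For even $n$ the common value is $(-1)^{n/2}\binom{n}{n/2}$, and for odd $n$ it is zero.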
4.6. Example: Stirling Numbers. Recall that $s(n,k) = (-1)^{n-k} c(n,k)$, where $c(n,k)$
is the number of permutations of an $n$-element set whose functional digraph consists of $k$
cycles (§3.6). We will show that

\[ \sum_{k=1}^{n} s(n,k) = \chi(n = 1) \qquad (n \geq 1). \]

Both sides are 1 when $n = 1$, so assume $n > 1$. Let $X$ be the set of all permutations
of $\{1,2,\ldots,n\}$. If $w \in X$ is a permutation with $k$ cycles, define $\mathrm{sgn}(w) = (-1)^k$. Now
$\sum_{w \in X} \mathrm{sgn}(w) = (-1)^n \sum_{k=1}^{n} s(n,k)$, so it suffices to define a sign-reversing involution $I$ on
$X$ with no fixed points. Given $w \in X$, the numbers 1 and 2 either appear in the same cycle
of $w$ or in different cycles. If 1 and 2 are in the same cycle, let the elements on this cycle
(starting at 1) be

\[ (1, x_1, x_2, \ldots, x_k, 2, y_1, y_2, \ldots, y_j), \]

where $j, k \geq 0$. Define $I(w)$ by replacing this cycle by the two cycles

\[ (1, x_1, x_2, \ldots, x_k)(2, y_1, y_2, \ldots, y_j) \]

and leaving all other cycles the same. Similarly, if 1 and 2 are in different cycles of $w$, write
these cycles as

\[ (1, x_1, x_2, \ldots, x_k)(2, y_1, y_2, \ldots, y_j) \]

and define $I(w)$ by replacing these two cycles by the single cycle

\[ (1, x_1, x_2, \ldots, x_k, 2, y_1, y_2, \ldots, y_j). \]

It is immediate that $I \circ I = \mathrm{id}_X$, $I$ is sign-reversing, and $I$ has no fixed points.
We can modify the preceding involution to obtain a combinatorial proof of the identity 


\[ \sum_{k \geq 0} s(i,k) S(k,j) = \chi(i = j), \]


which we proved algebraically in part (d) of 2.77. If $i < j$, then for every $k$, either $s(i,k) = 0$
or $S(k,j) = 0$. So both sides of the identity are zero in this case. If $i = j$, the left side reduces
to $s(i,i)S(i,i) = 1 = \chi(i = j)$. If $j = 0$, the identity is true. So we may assume $i$ and $j$
are fixed numbers such that $i > j > 0$. Let $X$ be the set of pairs $(w,U)$, where $w$ is a
permutation of $\{1,2,\ldots,i\}$ (viewed as a functional digraph) and $U$ is a set partition of the
set of cycles in $w$ into $j$ blocks. If $w$ has $k$ cycles, let $\mathrm{sgn}(w,U) = (-1)^k$. Then

\[ \sum_{(w,U) \in X} \mathrm{sgn}(w,U) = (-1)^i \sum_{k=j}^{i} s(i,k) S(k,j) \]

and $\chi(i = j) = 0$. So it suffices to define a sign-reversing involution $I$ on $X$ with no fixed
points. Given $(w,U) \in X$, there must exist a block of $U$ such that the cycles in this block
collectively involve more than one point in $\{1,2,\ldots,i\}$. This follows from the fact that $i$ (the
number of points) exceeds $j$ (the number of blocks). Among all such blocks in $U$, choose the
block that contains the smallest possible element in $\{1,2,\ldots,i\}$. Let this smallest element
be $a$, and let the second-smallest element in this block be $b$. To calculate $I(w,U)$, modify
the cycles in this block of $U$ as we did above, with $a$ and $b$ playing the roles of 1 and 2.
More specifically, a cycle of the form

\[ (a, x_1, \ldots, x_r, b, y_1, \ldots, y_t) \]

gets replaced (within its block) by

\[ (a, x_1, \ldots, x_r)(b, y_1, \ldots, y_t) \]


and vice versa. It is routine to check that $I$ is a sign-reversing involution on $X$ with no fixed
points. For example, suppose $i = 10$, $j = 3$, $w$ has cycles $(1)$, $(3,5)$, $(2,6,9)$, $(4,8)$, $(7)$, $(10)$,
and

\[ U = \{\{(1)\}, \{(3,5), (10)\}, \{(2,6,9), (4,8), (7)\}\}. \]

Here the block of $U$ modified by the involution is $\{(2,6,9), (4,8), (7)\}$, $a = 2$, and $b = 4$.
We compute $I(w,U)$ by replacing the cycles $(2,6,9)$ and $(4,8)$ in $w$ by the single cycle
$(2,6,9,4,8)$ and letting the new set partition be

\[ \{\{(1)\}, \{(3,5), (10)\}, \{(2,6,9,4,8), (7)\}\}. \]
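The two Stirling-number identities of Example 4.6 are easy to test numerically. The sketch below is not part of the text's argument: it assumes the standard recursions $c(n,k) = c(n-1,k-1) + (n-1)c(n-1,k)$ and $S(n,k) = S(n-1,k-1) + kS(n-1,k)$ (which are not restated in this section) and uses them to tabulate $s(n,k) = (-1)^{n-k}c(n,k)$ and $S(n,k)$.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def c(n, k):
    """Number of permutations of n objects with k cycles (assumed standard recursion)."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return c(n - 1, k - 1) + (n - 1) * c(n - 1, k)

@lru_cache(maxsize=None)
def S(n, k):
    """Stirling number of the second kind (assumed standard recursion)."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)

def s(n, k):
    """Signed Stirling number of the first kind, for 0 <= k <= n."""
    return (-1) ** (n - k) * c(n, k)

def row_sum(n):
    """sum_{k=1}^{n} s(n,k), which should equal chi(n = 1)."""
    return sum(s(n, k) for k in range(1, n + 1))

def orthogonality(i, j):
    """sum_{k>=0} s(i,k) S(k,j), which should equal chi(i = j)."""
    return sum(s(i, k) * S(k, j) for k in range(i + 1))
```

Only indices $k \leq i$ are summed in `orthogonality`, since $s(i,k) = 0$ for $k > i$.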


4.2 The Inclusion-Exclusion Formula 


Recall the sum rule: if $S_1,\ldots,S_n$ are pairwise disjoint finite sets, then $|S_1 \cup \cdots \cup S_n| =
\sum_{i=1}^{n} |S_i|$. Can we find a formula for $|S_1 \cup \cdots \cup S_n|$ in the case where the given sets $S_i$ are
not necessarily disjoint? The answer is provided by the inclusion-exclusion formula, which
we discuss now.

We have already seen the simplest case of the inclusion-exclusion formula. Specifically,
if $S$ and $T$ are any two finite sets, the binary union rule 1.4 states that

\[ |S \cup T| = |S| + |T| - |S \cap T|. \]

Intuitively, the sum $|S| + |T|$ overestimates the cardinality of $S \cup T$ because elements of
$S \cap T$ are included twice in this sum. To correct this, we exclude one copy of each of the
elements in $S \cap T$ by subtracting $|S \cap T|$.

Now consider three finite sets $S$, $T$, and $U$. The sum $|S| + |T| + |U|$ overcounts the
size of $S \cup T \cup U$ since elements in the overlaps between these sets are counted twice (or
three times, in the case of elements $z \in S \cap T \cap U$). We may try to account for this by
subtracting $|S \cap T| + |S \cap U| + |T \cap U|$ from $|S| + |T| + |U|$. If $x$ belongs to $S$ and $U$ but
not $T$ (say), this subtraction will cause $x$ to be counted only once in the overall expression.
A similar comment applies to elements in $(S \cap T) \sim U$ and $(T \cap U) \sim S$. However, an
element $z \in S \cap T \cap U$ is counted three times in $|S| + |T| + |U|$ and subtracted three times in
$|S \cap T| + |S \cap U| + |T \cap U|$. So we must include such elements once again by adding the term
$|S \cap T \cap U|$. In summary, we have given an informal argument suggesting that the formula

\[ |S \cup T \cup U| = |S| + |T| + |U| - |S \cap T| - |S \cap U| - |T \cap U| + |S \cap T \cap U| \]

should be true.
Generalizing the pattern in the preceding example, we arrive at the following formula, 
known as the inclusion-exclusion formula. 


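4.7. Inclusion-Exclusion Formula. Suppose $n > 0$ and $S_1,\ldots,S_n$ are any finite sets.
Then

\[ |S_1 \cup S_2 \cup \cdots \cup S_n| = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \qquad (4.1) \]


4.8. Example. If $n = 4$, the inclusion-exclusion formula for $|S_1 \cup S_2 \cup S_3 \cup S_4|$ is

\[ \begin{aligned}
 & |S_1| + |S_2| + |S_3| + |S_4| \\
 & - |S_1 \cap S_2| - |S_1 \cap S_3| - |S_1 \cap S_4| - |S_2 \cap S_3| - |S_2 \cap S_4| - |S_3 \cap S_4| \\
 & + |S_1 \cap S_2 \cap S_3| + |S_1 \cap S_2 \cap S_4| + |S_1 \cap S_3 \cap S_4| + |S_2 \cap S_3 \cap S_4| \\
 & - |S_1 \cap S_2 \cap S_3 \cap S_4|.
\end{aligned} \]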


4.9. Remark. By setting $I = \{i_1, i_2, \ldots, i_k\}$, the inclusion-exclusion formula can also be
written

\[ |S_1 \cup \cdots \cup S_n| = \sum_{\emptyset \neq I \subseteq \{1,2,\ldots,n\}} (-1)^{|I|-1} \Bigl| \bigcap_{i \in I} S_i \Bigr|. \]
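The subset form of the formula in 4.9 translates directly into a short computation. The following Python sketch (the function name is an illustrative choice) evaluates the right side by looping over all nonempty index subsets; it is exponential in $n$ and is meant only to make the formula concrete, not to be an efficient way of computing a union.

```python
from itertools import combinations

def union_size_by_inclusion_exclusion(sets):
    """Evaluate sum over nonempty I of (-1)^(|I|-1) * |intersection of S_i, i in I|."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            common = set.intersection(*(set(sets[i]) for i in idx))
            total += (-1) ** (k - 1) * len(common)
    return total
```

For any list of finite sets, the result should equal `len(set().union(*sets))`.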


We will give several proofs of the inclusion-exclusion formula. Each proof illustrates 
different techniques and can be generalized in different ways. 


4.10. Proof of Inclusion-Exclusion by Induction. We prove that (4.1) holds for all
$n > 0$ and all finite sets $S_1,\ldots,S_n$ by induction on $n$. The formula reduces to $|S_1| = |S_1|$
for $n = 1$, and this is certainly true. For $n = 2$, the formula becomes

\[ |S_1 \cup S_2| = |S_1| + |S_2| - |S_1 \cap S_2|, \]

and this is the binary union rule 1.4 proved previously. Now assume $n > 2$ and that formula
(4.1) is already known to hold for any union of $n - 1$ finite sets. Let $S_1,\ldots,S_n$ be fixed
finite sets. The $n$-fold union $S_1 \cup \cdots \cup S_n$ can be regarded as the union of the two sets
$S = S_1 \cup S_2 \cup \cdots \cup S_{n-1}$ and $T = S_n$. Hence, by the binary union rule 1.4,

\[ |S_1 \cup \cdots \cup S_n| = |S_1 \cup \cdots \cup S_{n-1}| + |S_n| - |(S_1 \cup \cdots \cup S_{n-1}) \cap S_n|. \]

Since the set operations $\cap$ and $\cup$ obey the distributive law, we can write the subtracted
term as

\[ |(S_1 \cap S_n) \cup (S_2 \cap S_n) \cup \cdots \cup (S_{n-1} \cap S_n)|, \]

which is the union of the $n - 1$ finite sets $S_i \cap S_n$ ($1 \leq i \leq n-1$). So we can apply the
induction hypothesis to this term, and to the first term $|S_1 \cup \cdots \cup S_{n-1}|$. We obtain

\[ \begin{aligned}
|S_1 \cup \cdots \cup S_n| = {} & |S_n| + \sum_{k=1}^{n-1} (-1)^{k-1} \sum_{1 \leq i_1 < \cdots < i_k \leq n-1} |S_{i_1} \cap \cdots \cap S_{i_k}| \\
 & - \sum_{j=1}^{n-1} (-1)^{j-1} \sum_{1 \leq i_1 < \cdots < i_j \leq n-1} |(S_{i_1} \cap S_n) \cap \cdots \cap (S_{i_j} \cap S_n)|.
\end{aligned} \]

We modify the second line of this formula as follows. First, observe that

\[ \bigcap_{r=1}^{j} (S_{i_r} \cap S_n) = S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_j} \cap S_n. \]

Next, change the summation index by setting $k = j + 1$ and defining $i_k = n$. The formula
now reads

\[ \begin{aligned}
|S_1 \cup \cdots \cup S_n| = {} & \sum_{k=1}^{n-1} (-1)^{k-1} \sum_{1 \leq i_1 < \cdots < i_k < n} |S_{i_1} \cap \cdots \cap S_{i_k}| \\
 & + |S_n| + \sum_{k=2}^{n} (-1)^{k-1} \sum_{1 \leq i_1 < \cdots < i_{k-1} < i_k = n} |S_{i_1} \cap \cdots \cap S_{i_k}|.
\end{aligned} \]

We can absorb $|S_n|$ into the sum on the second line by allowing $k$ to range from 1 to $n$
there. Also, letting $k$ range from 1 to $n$ in the first summation does not introduce any new
terms. After making these adjustments, the only difference between the formulas on the
first and second lines is that $i_k < n$ in the first line while $i_k = n$ in the second line. We can
now combine the two summations to obtain

\[ |S_1 \cup \cdots \cup S_n| = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \leq i_1 < \cdots < i_k \leq n} |S_{i_1} \cap \cdots \cap S_{i_k}|, \qquad (4.2) \]

which is the desired formula (4.1). This completes the induction.


In some counting problems, the following versions of the inclusion-exclusion formula are 
needed. 


4.11. Alternate Version of Inclusion-Exclusion Formula. Suppose $S_1,\ldots,S_n$ are
subsets of a finite set $X$. The number of elements $x \in X$ that belong to none of the $S_i$ is

\[ |X \sim (S_1 \cup \cdots \cup S_n)| = |X| + \sum_{k=1}^{n} (-1)^k \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \]

This formula follows from the original inclusion-exclusion formula and the difference rule 1.3.


Intuitively, the preceding formula is applicable when we are trying to count objects in $X$
that must simultaneously avoid a number of specified “bad” properties. Each set $S_i$ consists
of those objects in $X$ that have the $i$th bad property (and possibly other bad properties
too).


4.12. Simplified Version of the Inclusion-Exclusion Formula. Let $S_1,\ldots,S_n$ be
finite sets. Suppose that for all $k \geq 1$, the intersection of any $k$ distinct sets among the $S_i$'s
always has cardinality $N(k)$. In other words, $|S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}| = N(k)$ for all choices of
$i_1 < i_2 < \cdots < i_k$. Then

\[ |S_1 \cup \cdots \cup S_n| = \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} N(k). \]

If all $S_i$'s are subsets of a given finite set $X$, we also have

\[ |X \sim (S_1 \cup \cdots \cup S_n)| = |X| + \sum_{k=1}^{n} (-1)^k \binom{n}{k} N(k). \]

These formulas follow by substituting $N(k)$ for each summand $|S_{i_1} \cap \cdots \cap S_{i_k}|$ in the previous
inclusion-exclusion formulas and noting that there are $\binom{n}{k}$ such summands.


4.3 More Proofs of Inclusion-Exclusion


This section presents two proofs of the inclusion-exclusion formula that are more
combinatorial than the inductive computation already given.

4.13. Involution Proof of Inclusion-Exclusion. If we move all terms in (4.1) to the
left side, we obtain the formula

\[ \sum_{k=0}^{n} (-1)^k \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq n} |S_{i_1} \cap \cdots \cap S_{i_k}| = 0. \qquad (4.3) \]

In this equation, the summand corresponding to $k = 0$ is defined to be $|S_1 \cup \cdots \cup S_n|$. We
will prove this formula by introducing an involution on a suitable set of signed objects.

Let $X$ be the set of all sequences $(x; i_1, i_2, \ldots, i_k)$ such that $1 \leq i_1 < i_2 < \cdots < i_k \leq n$,
$0 \leq k \leq n$, and $x \in S_{i_1} \cap \cdots \cap S_{i_k}$. (If $k = 0$, then the object looks like $(x; )$, and the last
condition is interpreted to mean $x \in S_1 \cup \cdots \cup S_n$.) Define $\mathrm{sgn}(x; i_1, i_2, \ldots, i_k) = (-1)^k$. It
follows from the sum rule that $\sum_{z \in X} \mathrm{sgn}(z)$ is the left side of (4.3). So it suffices to define
a sign-reversing involution on $X$ with no fixed points.

Given $z = (x; i_1, \ldots, i_k) \in X$, we must have $x \in S_1 \cup S_2 \cup \cdots \cup S_n$, no matter what the value
of $k$ is. Let $i$ be the minimum index in $\{1,2,\ldots,n\}$ such that $x \in S_i$. By definition of $X$, we
either have $k = 0$ or $i < i_1$ or $i = i_1$. If $k = 0$ or $i < i_1$, define $I(z) = (x; i, i_1, i_2, \ldots, i_k)$. If
instead $i = i_1$, define $I(z) = (x; i_2, \ldots, i_k)$. It is immediate that $I(I(z)) = z$ and $\mathrm{sgn}(I(z)) =
-\mathrm{sgn}(z)$ for all $z \in X$.
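The bookkeeping in 4.13 can be carried out mechanically on small examples. The Python sketch below (the function name `signed_sum_via_involution` is hypothetical) lists the signed objects $(x; i_1,\ldots,i_k)$ for a given family of sets, applies the map $I$ defined above, verifies that it is a fixed-point-free involution on $X$, and returns the total signed sum, which should be zero.

```python
from itertools import combinations

def signed_sum_via_involution(sets):
    """Build X from 4.13, check the involution I, and return the sum of signs."""
    n = len(sets)
    union = set().union(*sets)
    # Objects (x, (i_1,...,i_k)) with 1-based increasing indices and x in every S_{i_j};
    # for k = 0 the condition is just that x lies in the union.
    X = [(x, idx)
         for k in range(n + 1)
         for idx in combinations(range(1, n + 1), k)
         for x in union
         if all(x in sets[i - 1] for i in idx)]

    def I(z):
        x, idx = z
        i = min(j for j in range(1, n + 1) if x in sets[j - 1])
        if idx and idx[0] == i:
            return (x, idx[1:])      # case i = i_1: delete the first index
        return (x, (i,) + idx)       # case k = 0 or i < i_1: prepend i

    members = set(X)
    assert all(I(z) in members and I(I(z)) == z and I(z) != z for z in X)
    return sum((-1) ** len(idx) for _, idx in X)
```

Because every object is paired with an object of opposite sign, the returned sum vanishes, which is formula (4.3).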


The preceding proof is quite ingenious, since it establishes a rather complicated formula 
by a remarkably simple bookkeeping bijection. On the other hand, it would be nice to 
have a combinatorial proof of inclusion-exclusion that is tied more closely to the intuitive 
“including and excluding” arguments we used originally to guess the formula for $|S \cup T \cup U|$.
We present such a proof next. 


4.14. Counting Proof of Inclusion-Exclusion. Fix $n$ finite sets $S_1,\ldots,S_n$, and put
$X = S_1 \cup \cdots \cup S_n$. We consider a large matrix $A$ whose rows are indexed by the elements
$x \in X$ and whose columns are indexed by all nonempty subsets $T$ of $\{1,2,\ldots,n\}$. Define
the entry in row $x$ and column $T$ of $A$ to be $(-1)^{|T|-1}$ if $x \in \bigcap_{i \in T} S_i$, and define this entry
to be zero otherwise. The sum of the entries in the column of $A$ indexed by $T = \{i_1 < i_2 <
\cdots < i_k\} \subseteq \{1,2,\ldots,n\}$ is

\[ (-1)^{k-1} |S_{i_1} \cap \cdots \cap S_{i_k}|. \]

Adding up all these column sums, we see that the sum $s$ of all entries in $A$ is

\[ s = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq n} |S_{i_1} \cap \cdots \cap S_{i_k}|. \]

Now, let us compute $s$ by adding up the row sums of $A$. Intuitively, the sum of the 1's
and $-1$'s in row $x$ of $A$ represents the net number of times $x$ has been counted in the
inclusion-exclusion sum written above. We claim that this number is 1 for all $x \in X$, so
that the sum of the row sums is $s = |X| = |S_1 \cup \cdots \cup S_n|$. This will complete the proof of
the inclusion-exclusion formula.

Fix $x \in X$, and let $U = \{i_1 < i_2 < \cdots < i_m\}$ be the set of all indices $i_j$ such that
$x \in S_{i_j}$. We have $m > 0$ since $x$ lies in at least one $S_i$. The entry in row $x$ and column $T$ of
$A$ is $(-1)^{|T|-1}$ if $T \subseteq U$, and this entry is zero if $T$ is not a subset of $U$. Note that there are
$\binom{m}{k}$ subsets of $U$ of size $k$, each of which contributes $(-1)^{k-1}$ to the row sum. Grouping all
such terms together and invoking the binomial theorem, we conclude that the sum of the
entries in row $x$ of $A$ is

\[ \sum_{k=1}^{m} \binom{m}{k} (-1)^{k-1} = 1 - \sum_{k=0}^{m} \binom{m}{k} (-1)^k 1^{m-k} = 1 - (-1+1)^m = 1. \]


4.4 Applications of the Inclusion-Exclusion Formula 


We can use the inclusion-exclusion formula to count complicated combinatorial collections
that cannot be conveniently enumerated by the sum and product rules alone. Recall that,
when using inclusion-exclusion, we often set up the problem so that each set $S_i$ consists of
those objects in some big set $X$ that have a certain “bad” property. Our desired answer is
then the cardinality of $X \sim (S_1 \cup \cdots \cup S_n)$, which is given by the inclusion-exclusion formula
in 4.11.


4.15. Example: Bridge Hands. A bridge hand is a 13-element subset of a 52-card deck.
A face card is a jack, queen, king, or ace. How many bridge hands have at least one of
each kind of face card? To answer this question, let $X$ be the set of all bridge hands;
note $|X| = \binom{52}{13}$. Define $S_1$ (resp. $S_2, S_3, S_4$) to be the set of all hands in $X$ that do not
have a jack (resp. queen, king, ace). The card hands we want are the elements of the set
$X \sim (S_1 \cup S_2 \cup S_3 \cup S_4)$. We must now compute the sizes of the various intersections
$S_{i_1} \cap \cdots \cap S_{i_k}$. Note that $|S_1| = \binom{48}{13}$ since we can build all hands in $S_1$ by choosing 13 cards
out of the 48 non-jacks in the deck. Similarly, $|S_2| = |S_3| = |S_4| = \binom{48}{13}$. Next, $|S_1 \cap S_3| = \binom{44}{13}$
since we can build hands in $S_1 \cap S_3$ by choosing 13 cards out of the 44 cards in the deck
that are neither jacks nor kings. The same formula holds for all other twofold intersections.
Similarly, each threefold intersection has size $\binom{40}{13}$, while $|S_1 \cap S_2 \cap S_3 \cap S_4| = \binom{36}{13}$. It follows
from inclusion-exclusion that the answer to the original question is

\[ \binom{52}{13} - 4\binom{48}{13} + 6\binom{44}{13} - 4\binom{40}{13} + \binom{36}{13} = 128{,}971{,}619{,}088. \]

Next, how many 13-card bridge hands have at least one jack, at least one queen, and at
least one king, but do not contain any ace cards or spade cards? The last condition can be
dealt with as follows: throw out the $13 + 4 - 1 = 16$ aces and spades at the outset, leaving
$52 - 16 = 36$ cards. An inclusion-exclusion argument like the one in the last paragraph now
leads to the answer

\[ \binom{36}{13} - 3\binom{33}{13} + 3\binom{30}{13} - \binom{27}{13} = 930{,}511{,}530. \]
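Both bridge-hand counts are instances of the simplified formula 4.12 with $N(k) = \binom{d - rk}{13}$, where $d$ is the deck size and $r$ is the number of cards of each required kind. A small Python sketch (the function name and parameter names are illustrative choices) reproduces the two numbers with `math.comb`.

```python
from math import comb

def hands_hitting_every_kind(deck, hand, kinds, per_kind):
    """Count `hand`-card subsets of a `deck`-card deck that contain at least one card
    of each of `kinds` disjoint kinds, each kind having `per_kind` cards."""
    return sum((-1) ** k * comb(kinds, k) * comb(deck - per_kind * k, hand)
               for k in range(kinds + 1))

all_face_cards = hands_hitting_every_kind(52, 13, 4, 4)   # jack, queen, king, ace
no_aces_or_spades = hands_hitting_every_kind(36, 13, 3, 3)  # 3 non-spade J, Q, K each
```

Here `all_face_cards` evaluates to 128,971,619,088 and `no_aces_or_spades` to 930,511,530, matching the values above.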


4.16. Example. How many words $w \in X = R(1^2 2^2 3^2 \cdots n^2)$ never have two adjacent
letters that are equal? Note first that $|X| = \binom{2n}{2,2,\ldots,2} = (2n)!/2^n$. For $1 \leq i \leq n$, let $S_i$ be
the set of words in $X$ in which the two copies of letter $i$ are adjacent to each other. We wish
to count the words in $X \sim (S_1 \cup \cdots \cup S_n)$. To do so, fix $i_1 < i_2 < \cdots < i_k$ and consider a
typical intersection $S_{i_1} \cap \cdots \cap S_{i_k}$. Given a word $w$ in this intersection, form a new word
by replacing the two consecutive copies of $i_j$ by a single copy of $i_j$, for $1 \leq j \leq k$. This
operation defines a bijection from $S_{i_1} \cap \cdots \cap S_{i_k}$ onto the set $R(1^{a_1} 2^{a_2} \cdots n^{a_n})$, where $a_i = 1$
if $i = i_j$ for some $j$, and $a_i = 2$ otherwise. (The inverse bijection replaces each $i_j$ by two
consecutive copies of $i_j$.) It follows that

\[ |S_{i_1} \cap \cdots \cap S_{i_k}| = \binom{k + 2(n-k)}{\underbrace{1,\ldots,1}_{k},\underbrace{2,\ldots,2}_{n-k}} = (2n-k)!/2^{n-k}. \]

This expression does not depend on the indices $i_1,\ldots,i_k$. Also, when $k = 0$, this expression
reduces to $|X|$. Using the simplified inclusion-exclusion formula 4.12, we conclude that

\[ |X \sim (S_1 \cup \cdots \cup S_n)| = \sum_{k=0}^{n} (-1)^k \binom{n}{k} \frac{(2n-k)!}{2^{n-k}}. \]
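The formula of 4.16 can be compared with a direct enumeration for small $n$. The following Python sketch (function names are illustrative) evaluates the alternating sum and, separately, lists all rearrangements of the multiset $\{1,1,2,2,\ldots,n,n\}$ and counts those with no two equal adjacent letters.

```python
from itertools import permutations
from math import comb, factorial

def count_by_formula(n):
    """sum_{k=0}^{n} (-1)^k C(n,k) (2n-k)! / 2^(n-k); each quotient is an exact integer."""
    return sum((-1) ** k * comb(n, k) * (factorial(2 * n - k) // 2 ** (n - k))
               for k in range(n + 1))

def count_by_brute_force(n):
    """Count words in R(1^2 2^2 ... n^2) with no two adjacent equal letters."""
    letters = [i for i in range(1, n + 1) for _ in range(2)]
    words = set(permutations(letters))  # distinct rearrangements of the multiset
    return sum(all(a != b for a, b in zip(w, w[1:])) for w in words)
```

For $n = 2$ both functions return 2, corresponding to the words 1212 and 2121.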
For our next examples, we use inclusion-exclusion to enumerate certain combinatorial
collections that have arisen in earlier chapters. 


4.17. Theorem: Enumeration of Surjections. Let $\mathrm{Surj}(m,n)$ be the number of
surjections from an $m$-element set onto an $n$-element set. If $m \geq n \geq 1$, then

\[ \mathrm{Surj}(m,n) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^m. \]


Proof. Let $X$ be the set of all functions $f : \{1,2,\ldots,m\} \to \{1,2,\ldots,n\}$. Note that $|X| =
n^m$. For $1 \leq i \leq n$, let $S_i$ consist of all functions $f \in X$ such that $i$ is not in the image of
$f$. A function $f \in X$ is a surjection iff $f$ belongs to none of the $S_i$. Thus, we must compute
$|X \sim (S_1 \cup \cdots \cup S_n)|$. Consider a typical intersection $S_{i_1} \cap \cdots \cap S_{i_k}$, where $i_1 < i_2 < \cdots < i_k$. A
function $f$ belonging to this intersection is the same thing as an arbitrary function mapping
$\{1,2,\ldots,m\}$ into the $(n-k)$-element set $\{1,2,\ldots,n\} \sim \{i_1, i_2, \ldots, i_k\}$. The number of
such functions is $(n-k)^m$, independent of $i_1,\ldots,i_k$. Using the simplified inclusion-exclusion
formula 4.12, we get

\[ \mathrm{Surj}(m,n) = |X \sim (S_1 \cup \cdots \cup S_n)| = n^m + \sum_{k=1}^{n} (-1)^k \binom{n}{k} (n-k)^m, \]

which is equivalent to the formula of the theorem. □


Since S(m,n) = Surj(m,n)/n! by 2.58, we deduce the following formula for Stirling 
numbers of the second kind. 


4.18. Theorem: Summation Formula for Stirling Numbers of the Second Kind. 


\[ S(m,n) = \frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^m = \sum_{k=0}^{n} (-1)^k \frac{(n-k)^m}{k!\,(n-k)!}. \]
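The surjection formula 4.17 and the resulting expression for $S(m,n)$ in 4.18 can be checked against a direct enumeration of functions. In this Python sketch (function names are illustrative), a function $f : \{1,\ldots,m\} \to \{1,\ldots,n\}$ is represented by its tuple of values, and it is a surjection exactly when all $n$ values occur.

```python
from itertools import product
from math import comb, factorial

def surj(m, n):
    """Number of surjections from an m-set onto an n-set, by inclusion-exclusion."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))

def surj_brute_force(m, n):
    """Enumerate all n^m functions and count those whose image has n elements."""
    return sum(len(set(f)) == n for f in product(range(n), repeat=m))

def stirling2(m, n):
    """S(m,n) = Surj(m,n) / n!."""
    return surj(m, n) // factorial(n)
```

For instance, `surj(4, 2)` is 14 (all 16 functions into a 2-element set except the two constant ones), so `stirling2(4, 2)` is 7.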
Our next illustration of inclusion-exclusion comes from number theory. 


4.19. Definition: Euler’s $\phi$ Function. For each integer $m \geq 1$, let $\phi(m)$ be the number
of integers $x \in \{1,2,\ldots,m\}$ such that $\gcd(x,m) = 1$.


For example, if $m = 12$, then the relevant integers $x$ are 1, 5, 7, and 11, so $\phi(12) = 4$.
The function $\phi$ is prominent in algebra and number theory and has applications to modern
cryptography.


4.20. Theorem: Formula for $\phi(m)$. Suppose an integer $m > 1$ has prime factorization
$m = p_1^{e_1} p_2^{e_2} \cdots p_n^{e_n}$. Then

\[ \phi(m) = \prod_{i=1}^{n} p_i^{e_i - 1}(p_i - 1) = m \prod_{i=1}^{n} (1 - 1/p_i). \]


Proof. Let $X = \{1,2,\ldots,m\}$, and let $S_i = \{x \in X : p_i | x\}$. (The symbol $p_i | x$ means that $p_i$
divides $x$.) By the fundamental theorem of arithmetic, $x \in X$ is not relatively prime to $m$
iff $x$ and $m$ have a common factor greater than 1 iff $x$ and $m$ have a common prime factor.
It follows that

\[ \phi(m) = |X \sim (S_1 \cup S_2 \cup \cdots \cup S_n)|. \]

So we are in a position to use inclusion-exclusion. Here it is convenient to write the
inclusion-exclusion formula as follows:

\[ |X \sim (S_1 \cup \cdots \cup S_n)| = \sum_{I \subseteq \{1,2,\ldots,n\}} (-1)^{|I|} \Bigl| \bigcap_{i \in I} S_i \Bigr|, \]

where we interpret $\bigcap_{i \in \emptyset} S_i$ as the set $X$. Fix a subset $I = \{i_1 < \cdots < i_k\} \subseteq \{1,2,\ldots,n\}$,
and consider the intersection $S_{i_1} \cap \cdots \cap S_{i_k}$. An integer $x \leq m$ lies in this intersection iff
$p_{i_j} | x$ for $1 \leq j \leq k$ iff the product $q = p_{i_1} p_{i_2} \cdots p_{i_k}$ divides $x$ iff $x$ is a multiple of $q$. Now,
the number of multiples of $q$ between 1 and $m$ is $m/q = m/\prod_{i \in I} p_i$. If $I = \emptyset$ and the empty
product is interpreted as 1, this expression becomes $m = |X|$. Hence, the inclusion-exclusion
formula can be written

\[ \phi(m) = |X \sim (S_1 \cup \cdots \cup S_n)| = m \sum_{I \subseteq \{1,2,\ldots,n\}} \frac{(-1)^{|I|}}{\prod_{i \in I} p_i}. \]

On the other hand, consider what happens when we expand the product

\[ \prod_{i=1}^{n} \Bigl( 1 - \frac{1}{p_i} \Bigr) \]

using the generalized distributive law (cf. 2.7). We will obtain a sum of $2^n$ terms, each
of which is obtained by choosing either 1 or $-\frac{1}{p_i}$ from the $i$th factor of the product and
multiplying these choices together. We can index these $2^n$ terms by subsets $I \subseteq \{1,2,\ldots,n\}$,
where $i \in I$ iff we chose $-\frac{1}{p_i}$ from the $i$th factor. It follows that

\[ m \prod_{i=1}^{n} \Bigl( 1 - \frac{1}{p_i} \Bigr) = m \sum_{I \subseteq \{1,2,\ldots,n\}} \frac{(-1)^{|I|}}{\prod_{i \in I} p_i} = \phi(m). \]  □
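The inclusion-exclusion expression for $\phi(m)$ derived in this proof can be compared with the definition 4.19. The Python sketch below (function names are illustrative; the trial-division factoring routine is a simple helper, not something taken from the text) sums $(-1)^{|I|}\, m / \prod_{i \in I} p_i$ over all subsets $I$ of the distinct prime divisors of $m$.

```python
from itertools import combinations
from math import gcd, prod

def distinct_prime_factors(m):
    """Distinct primes dividing m, found by trial division."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def phi_by_inclusion_exclusion(m):
    """sum over subsets I of the prime divisors of (-1)^|I| * m / prod_{i in I} p_i."""
    ps = distinct_prime_factors(m)
    return sum((-1) ** k * (m // prod(q))
               for k in range(len(ps) + 1)
               for q in combinations(ps, k))

def phi_by_definition(m):
    """Count x in {1,...,m} with gcd(x, m) = 1."""
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)
```

Each quotient `m // prod(q)` is exact because the product of distinct prime divisors of $m$ divides $m$.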


4.21. Remark. We sketch an alternative proof of the formula for $\phi(m)$ that avoids
inclusion-exclusion. This proof sketch will use some facts from algebra and number theory
without proof. For any commutative ring $R$, we let $R^*$ be the set of units in $R$; i.e.,
the set of $x \in R$ such that there exists $y \in R$ with $xy = yx = 1_R$. The following facts
are routinely verified. First, if $R$ and $S$ are isomorphic rings, then $|R^*| = |S^*|$. Second,
given a product ring $R \times S$, we have $(R \times S)^* = R^* \times S^*$ and hence (by the product rule)
$|(R \times S)^*| = |R^*| \cdot |S^*|$. Third, $\gcd(x,n) = 1$ iff there exist integers $y, z$ with $xy + nz = 1$ iff
$x$ has a multiplicative inverse in the ring of integers modulo $n$. So $\phi(n) = |(\mathbb{Z}/n\mathbb{Z})^*|$. Fourth,
by the Chinese Remainder Theorem, the rings $\mathbb{Z}/mn\mathbb{Z}$ and $\mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z}$ are isomorphic
whenever $\gcd(m,n) = 1$. Combining these four facts, we see that $\gcd(m,n) = 1$ implies

\[ \phi(mn) = |(\mathbb{Z}/mn\mathbb{Z})^*| = |(\mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z})^*| = |(\mathbb{Z}/m\mathbb{Z})^*| \cdot |(\mathbb{Z}/n\mathbb{Z})^*| = \phi(m)\phi(n). \]

Iteration of this result gives

\[ \phi(p_1^{e_1} \cdots p_n^{e_n}) = \prod_{i=1}^{n} \phi(p_i^{e_i}) \]

whenever $p_1,\ldots,p_n$ are distinct primes. Thus, it suffices to evaluate $\phi$ at prime powers.
But a direct counting argument using the difference rule and the definition of $\phi$ shows that
$\phi(p^e) = p^e - p^{e-1} = p^{e-1}(p-1)$ when $p$ is prime and $e \geq 1$. So we obtain the first formula
for $\phi(m)$ given in 4.20.


4.5 Derangements


The inclusion-exclusion formula allows us to enumerate a special class of permutations 
called derangements. Intuitively, a derangement of 1,2,...,n is a rearrangement of these 
n symbols such that no symbol remains in its original position. The formal definition is as 
follows. 


4.22. Definition: Derangements. A derangement of a set $S$ is a bijection $f : S \to S$ such
that $f(x) \neq x$ for all $x \in S$. For $n \geq 0$, let $D_n$ be the set of derangements of $\{1,2,\ldots,n\}$,
and let $d_n = |D_n|$.


Note that $d_0 = 1$ (since the function with empty graph satisfies the definition of
derangement), while $d_1 = 0$. To give more examples of derangements, let us identify an element
$f \in D_n$ with the word $f(1)f(2)\cdots f(n)$. Then $d_2 = 1$ since 21 is the unique derangement
of two letters. The derangements of three letters are 312 and 231, so that $d_3 = 2$. The
permutation 5317426 is a derangement of seven letters.


4.23. Summation Formula for Derangements. For $n \geq 1$, the number of derangements
of $n$ letters is

\[ d_n = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}. \]

Consequently, $d_n$ is the closest integer to $n!/e$ for $n \geq 1$.


Proof. Let $X$ be the set of all permutations of $n$ letters; note that $|X| = n!$. For $1 \leq i \leq n$,
let $S_i = \{f \in X : f(i) = i\}$. The set $D_n$ consists of precisely those elements in $X$ that
belong to none of the $S_i$, so $D_n = X \sim (S_1 \cup \cdots \cup S_n)$. To apply the inclusion-exclusion
formula, we must consider a typical intersection $S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}$, where $i_1 < i_2 < \cdots < i_k$.
A permutation $f \in X$ belongs to this intersection iff $f$ fixes $i_1,\ldots,i_k$ and permutes the
remaining $n - k$ letters among themselves. The number of such permutations is $(n-k)!$.
This number depends only on $k$ and not on the indices $i_1,\ldots,i_k$. Applying the simplified
inclusion-exclusion formula 4.12, we obtain

\[ d_n = n! + \sum_{k=1}^{n} (-1)^k \binom{n}{k} (n-k)! = n! + \sum_{k=1}^{n} (-1)^k \frac{n!}{k!} = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}. \]

To relate this formula to the expression $n!/e$, recall from calculus that

\[ e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!} \qquad (x \in \mathbb{R}). \]

Setting $x = -1$, we see that

\[ \frac{1}{e} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}. \]

Multiplying by $n!$ and comparing to our formula for $d_n$, we see that

\[ n!/e - d_n = n! \sum_{k=n+1}^{\infty} \frac{(-1)^k}{k!}. \]

It now suffices to show that the right side of this formula is less than 1/2 in absolute value.
Factoring out $\frac{1}{(n+1)!}$ from each term in the series, we obtain

\[ |n!/e - d_n| = \frac{1}{n+1} \left| 1 - \frac{1}{n+2} + \frac{1}{(n+2)(n+3)} - \frac{1}{(n+2)(n+3)(n+4)} + \cdots \right|. \]

The series within the absolute values on the right side is an alternating series that converges
to a sum strictly less than 1. Since $n \geq 1$, it follows that

\[ |n!/e - d_n| < \frac{1}{n+1} \cdot 1 \leq 1/2. \]  □


The following table lists the first few values of $d_n$.


n   | 0 | 1 | 2 | 3 | 4 | 5  | 6   | 7    | 8      | 9
d_n | 1 | 0 | 1 | 2 | 9 | 44 | 265 | 1854 | 14,833 | 133,496


Like any permutation, a derangement has a functional digraph consisting of the disjoint 
union of one or more cycles. A permutation is a derangement iff there are no 1-cycles in its 
functional digraph. This observation leads to the following recursion for derangements. 


4.24. Theorem: Recursion for Derangements. We have $d_0 = 1$, $d_1 = 0$, and
$d_n = (n-1)d_{n-1} + (n-1)d_{n-2}$ $(n \geq 2)$.


Proof. Fix $n \geq 2$. Write the set of derangements $D_n$ as the disjoint union of sets $A$ and
$B$, where $A$ consists of those derangements in which $n$ is involved in a cycle of length 2,
and $B$ consists of the derangements where $n$ is in a cycle of length greater than 2. To
build an object in $A$, choose the partner of $n$ in its 2-cycle ($n - 1$ ways), and then choose
a derangement of the remaining objects ($d_{n-2}$ ways). To build an object in $B$, choose a
derangement of the first $n - 1$ objects ($d_{n-1}$ ways), consider the functional digraph of this
derangement, and splice $n$ into a cycle just before any of the $n - 1$ available elements (which
is guaranteed to create a cycle of length 3 or more). The recursion now follows from the
sum and product rules. □


4.25. Theorem: Second Recursion for Derangements. We have $d_0 = 1$ and
$d_n = n d_{n-1} + (-1)^n$ $(n \geq 1)$.

Proof. We argue by induction on $n$. If $n = 1$, then

\[ d_n = d_1 = 0 = 1 \cdot 1 + (-1)^1 = n d_{n-1} + (-1)^n. \]

Now assume $n > 1$ and that $d_{n-1} = (n-1)d_{n-2} + (-1)^{n-1}$. We can use this assumption to
eliminate $(n-1)d_{n-2}$ in the first recursion 4.24 for $d_n$ (which is already known to hold for
all $n$). We thereby obtain

\[ d_n = (n-1)d_{n-1} + (n-1)d_{n-2} = (n-1)d_{n-1} + (d_{n-1} - (-1)^{n-1}) = n d_{n-1} + (-1)^n. \]

This completes the induction. □
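The summation formula 4.23, the recursions 4.24 and 4.25, and the "closest integer to $n!/e$" statement can all be cross-checked numerically. The Python sketch below (function names are illustrative) computes $d_n$ in three independent ways; the sum is evaluated in exact integer arithmetic by writing $n!/k!$ as an integer quotient.

```python
from itertools import permutations
from math import e, factorial

def d_sum(n):
    """d_n = n! * sum_{k=0}^{n} (-1)^k / k!, computed exactly with integers."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))

def d_rec(n):
    """d_n from the recursion d_n = (n-1) d_{n-1} + (n-1) d_{n-2}, d_0 = 1, d_1 = 0."""
    prev, cur = 1, 0  # d_0, d_1
    if n == 0:
        return prev
    for m in range(2, n + 1):
        prev, cur = cur, (m - 1) * (prev + cur)
    return cur

def d_brute_force(n):
    """Count permutations of {0,...,n-1} with no fixed point."""
    return sum(all(p[i] != i for i in range(n)) for p in permutations(range(n)))
```

For $1 \leq n \leq 9$, `round(factorial(n) / e)` agrees with `d_sum(n)`; for much larger $n$ floating-point precision would eventually make that comparison unreliable, while the integer computations remain exact.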
4.6 Coefficients of Chromatic Polynomials


Let $G$ be a simple graph. Recall that $\chi_G(x)$ denotes the number of proper colorings of the
vertices of $G$ using $x$ available colors. We have seen in 3.100 that $\chi_G(x)$ is always a polynomial
in $x$. In this section, we use inclusion-exclusion to analyze the chromatic polynomial of $G$.
This analysis will lead to a combinatorial interpretation for the coefficients of the chromatic
polynomial $\chi_G(x)$.


4.26. Definition: Vertex-spanning Subgraph. Let G = (V(G), E(G)) be a simple 
graph. A vertex-spanning subgraph of G is a subgraph H of G such that V(H) = V(G). 


The map $H \mapsto E(H)$ is a bijection between the set of vertex-spanning subgraphs of $G$
and the set of all subsets of $E(G)$.


4.27. Theorem: Coefficients of Chromatic Polynomials. Let $G$ be a simple graph.
For each $e, c \geq 0$, let $n(e,c)$ be the number of vertex-spanning subgraphs of $G$ with $e$ edges
and $c$ connected components. Then

\[ \chi_G(x) = \sum_{e,c \geq 0} (-1)^e n(e,c) x^c. \]


Proof. Let $e_1,\ldots,e_n$ be the edges of G. Let X be the set of all colorings of G (proper or not)
using x available colors, and let $S_i$ be the set of colorings in X such that both endpoints of the
edge $e_i$ receive the same color. We wish to compute $|X \setminus (S_1 \cup \cdots \cup S_n)|$. Consider a typical
intersection $\bigcap_{i \in T} S_i$, where $T \subseteq \{1,2,\ldots,n\}$. The edge subset $\{e_i : i \in T\}$ determines
a vertex-spanning subgraph H of G with $|T|$ edges and some number cc(H) of connected
components. One may check that a coloring f belongs to $\bigcap_{i \in T} S_i$ iff f is constant on
each connected component of H. It follows from the product rule that $|\bigcap_{i \in T} S_i| = x^{cc(H)}$,
since we can choose one of x colors for each connected component of H. Note also that
$|X| = x^{|V(G)|} = x^{cc(H_0)}$ where $H_0 = (V(G), \emptyset)$. By inclusion-exclusion,

$$\chi_G(x) = |X| + \sum_{\emptyset \ne T \subseteq \{1,2,\ldots,n\}} (-1)^{|T|} \Bigl|\bigcap_{i \in T} S_i\Bigr|
= \sum_{\text{vertex-spanning subgraphs } H} (-1)^{|E(H)|} x^{cc(H)} = \sum_{e,c \ge 0} (-1)^e\, n(e,c)\, x^c. \qquad \Box$$
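
Theorem 4.27 can be checked by brute force on a small graph. The following Python sketch is only an illustration: the function names and the choice of the 4-cycle as test graph are assumptions made here, not part of the text. It enumerates all edge subsets (equivalently, all vertex-spanning subgraphs), accumulates $(-1)^e$ into the coefficient of $x^c$, and compares the resulting polynomial with a direct count of proper colorings.

```python
from itertools import combinations, product

def count_components(num_vertices, edges):
    """Number of connected components of the graph, via a small union-find."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(num_vertices)})

def chromatic_coefficients(num_vertices, edges):
    """coeffs[c] = sum over e of (-1)^e n(e, c), as in Theorem 4.27."""
    coeffs = [0] * (num_vertices + 1)
    for e in range(len(edges) + 1):
        for subset in combinations(edges, e):  # one vertex-spanning subgraph
            c = count_components(num_vertices, subset)
            coeffs[c] += (-1) ** e
    return coeffs

def count_proper_colorings(num_vertices, edges, x):
    """Direct count of proper colorings with x colors (exponential time)."""
    return sum(
        all(col[u] != col[v] for u, v in edges)
        for col in product(range(x), repeat=num_vertices)
    )

# Hypothetical test graph: the 4-cycle on vertices 0, 1, 2, 3.
c4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
coeffs = chromatic_coefficients(4, c4_edges)
```

For the 4-cycle this yields the coefficient list of $x^4 - 4x^3 + 6x^2 - 3x$, indexed by the power of x. The enumeration visits $2^{|E(G)|}$ subgraphs, so it is only practical for very small graphs.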


4.7 Classical Möbius Inversion


We conclude this chapter with a brief introduction to the theory of Möbius inversion. We
begin in this section by studying the number-theoretic Möbius function and the classical
Möbius inversion formula. Later sections discuss the generalization of the Möbius function
and inversion formula to posets.


4.28. Definition: Classical Möbius Function. Suppose $m \ge 1$ is an integer with prime
factorization $m = p_1^{e_1} p_2^{e_2} \cdots p_n^{e_n}$, where $n \ge 0$, $e_i > 0$, and the $p_i$'s are distinct primes. (We
take n = 0 when m = 1.) The Möbius function $\mu : \mathbb{N}^+ \to \{-1, 0, 1\}$ is defined by $\mu(m) = 0$
if $e_i > 1$ for some i, whereas $\mu(m) = (-1)^n$ if $e_i = 1$ for all i.

In other words, $\mu(m)$ is zero if m is divisible by the square of a prime; $\mu(m) = +1$ if m
is the product of an even number of distinct primes; and $\mu(m) = -1$ if m is the product of
an odd number of distinct primes. For example,

$$\mu(1) = 1, \quad \mu(7) = -1, \quad \mu(10) = 1, \quad \mu(12) = 0, \quad \mu(30) = -1.$$

The following theorem is the key to proving the Möbius inversion formula.


4.29. Theorem. For all integers $m \ge 1$, $\sum_{d|m} \mu(d) = \chi(m = 1)$. (Here and below, the
symbol $\sum_{d|m}$ means that we sum over all positive divisors d of the integer m.)


Proof. When m = 1, we have $\sum_{d|m} \mu(d) = \mu(1) = 1 = \chi(m = 1)$. Suppose next that m > 1
and m has prime factorization $p_1^{e_1} \cdots p_n^{e_n}$. Instead of summing $\mu(d)$ over all divisors d of
m, we may equally well sum over just the square-free divisors d of m, which give the only
nonzero contributions to the sum. Examining prime factorizations, we see that there are
$2^n$ such square-free divisors, which have the form $\prod_{i \in T} p_i$ as T ranges over all subsets of
$\{1,2,\ldots,n\}$. Therefore,

$$\sum_{d|m} \mu(d) = \sum_{T \subseteq \{1,2,\ldots,n\}} \mu\Bigl(\prod_{i \in T} p_i\Bigr) = \sum_{T \subseteq \{1,2,\ldots,n\}} (-1)^{|T|}.$$

Collecting together summands indexed by subsets T of the same size k, we conclude that

$$\sum_{d|m} \mu(d) = \sum_{T \subseteq \{1,2,\ldots,n\}} (-1)^{|T|} = \sum_{k=0}^{n} \binom{n}{k} (-1)^k 1^{n-k} = (-1+1)^n = 0. \qquad \Box$$
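
As a sanity check on Definition 4.28 and Theorem 4.29, here is a minimal Python sketch. The helper names `mobius` and `divisors` are chosen here for illustration, and trial division is just one simple way to factor small integers.

```python
def mobius(m):
    """Classical Möbius function of an integer m >= 1, by trial division."""
    result = 1
    p = 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:      # p^2 divided the original m
                return 0
            result = -result    # one more distinct prime factor
        p += 1
    if m > 1:                   # a single prime factor is left over
        result = -result
    return result

def divisors(m):
    """All positive divisors of m, in increasing order."""
    return [d for d in range(1, m + 1) if m % d == 0]

def divisor_sum_of_mobius(m):
    """Left side of Theorem 4.29: the sum of mu(d) over all d dividing m."""
    return sum(mobius(d) for d in divisors(m))
```

With these definitions, `[mobius(m) for m in (1, 7, 10, 12, 30)]` reproduces the five example values listed after Definition 4.28, and `divisor_sum_of_mobius(m)` is 1 for m = 1 and 0 for every larger m that one tries.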


4.30. Classical Möbius Inversion Formula. Suppose f and g are functions with domain
$\mathbb{N}^+$ such that

$$f(m) = \sum_{d|m} g(d) \qquad (m \ge 1).$$

Then

$$g(m) = \sum_{d|m} f(m/d)\mu(d) = \sum_{d|m} f(d)\mu(m/d) \qquad (m \ge 1).$$


Proof. We use the definition of f to expand the first claimed formula for g(m):

$$\sum_{d|m} f(m/d)\mu(d) = \sum_{d|m} \Bigl(\sum_{c|(m/d)} g(c)\Bigr)\mu(d) = \sum_{(c,d) \in S} g(c)\mu(d),$$

where $S = \{(c,d) \in \mathbb{N}^+ \times \mathbb{N}^+ : d|m,\ c|(m/d)\}$. It follows routinely from the definition of
divisibility that

$$S = \{(c,d) : d|m,\ cd|m\} = \{(c,d) : c|m,\ cd|m\} = \{(c,d) : c|m,\ d|(m/c)\}.$$

Therefore, the calculation continues as follows:

$$\sum_{(c,d) \in S} g(c)\mu(d) = \sum_{c|m} g(c)\Bigl(\sum_{d|(m/c)} \mu(d)\Bigr) = \sum_{c|m} g(c)\chi(m/c = 1) = g(m).$$

The next-to-last step used 4.29 to simplify the inner sum. We conclude that

$$g(m) = \sum_{d|m} f(m/d)\mu(d) = \sum_{d|m} f(d)\mu(m/d),$$

where the final equality results by replacing d by m/d in the summation. □
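
The inversion formula 4.30 can be tried out numerically. In the sketch below, the function g is an arbitrary choice made purely for illustration, and `mobius` and `divisors` are small helpers defined here rather than anything taken from the text. We build f from g by summing over divisors and then recover g from f.

```python
def mobius(m):
    """Classical Möbius function, by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

def divisors(m):
    return [d for d in range(1, m + 1) if m % d == 0]

def g(m):
    """An arbitrary test function on the positive integers (assumption)."""
    return m * m + 3

def f(m):
    """f(m) = sum of g(d) over d | m, the hypothesis of 4.30."""
    return sum(g(d) for d in divisors(m))

def g_recovered(m):
    """Right side of 4.30: sum of f(m/d) mu(d) over d | m."""
    return sum(f(m // d) * mobius(d) for d in divisors(m))
```

For every m tested, `g_recovered(m)` agrees with `g(m)`, as the formula predicts.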


To give examples of the Möbius inversion formula, we first introduce some functions
that are studied in number theory. 


4.31. Definition: Number-Theoretic Functions $\tau$, $\sigma$, and $\sigma_2$. Let $m \ge 1$ be an integer.
Define

$$\tau(m) = \sum_{d|m} 1, \qquad \sigma(m) = \sum_{d|m} d, \qquad \sigma_2(m) = \sum_{d|m} d^2.$$

Thus, $\tau(m)$ is the number of positive divisors of m; $\sigma(m)$ is the sum of these divisors; and
$\sigma_2(m)$ is the sum of the squares of these divisors.


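4.32. Example. Taking m = 1, 4, 7, 12, 30, we calculate:

$$\tau(1) = 1, \quad \tau(4) = 3, \quad \tau(7) = 2, \quad \tau(12) = 6, \quad \tau(30) = 8;$$
$$\sigma(1) = 1, \quad \sigma(4) = 7, \quad \sigma(7) = 8, \quad \sigma(12) = 28, \quad \sigma(30) = 72;$$
$$\sigma_2(1) = 1, \quad \sigma_2(4) = 21, \quad \sigma_2(7) = 50, \quad \sigma_2(12) = 210, \quad \sigma_2(30) = 1300.$$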


If m has prime factorization $p_1^{e_1} \cdots p_n^{e_n}$, then the divisors of m have the form $p_1^{f_1} \cdots p_n^{f_n}$
where $0 \le f_i \le e_i$ for all i. The product rule therefore gives $\tau(m) = \prod_{i=1}^{n} (e_i + 1)$ (build
a divisor by choosing $f_1,\ldots,f_n$). Using the generalized distributive law and the geometric
series formula, one may also check that

$$\sigma(m) = \prod_{i=1}^{n} \Bigl(\sum_{f_i=0}^{e_i} p_i^{f_i}\Bigr) = \prod_{i=1}^{n} \frac{p_i^{e_i+1} - 1}{p_i - 1}.$$

Applying the Möbius inversion formula to the definitions of $\tau$, $\sigma$, and $\sigma_2$, we obtain the
following identities.


4.33. Theorem. For $m \ge 1$, we have

$$1 = \sum_{d|m} \tau(m/d)\mu(d); \qquad m = \sum_{d|m} \sigma(m/d)\mu(d); \qquad m^2 = \sum_{d|m} \sigma_2(m/d)\mu(d).$$

The next result uses Möbius inversion to deduce information about Euler's $\phi$ function.


4.34. Theorem: $\phi$ versus $\mu$. For all $m \ge 1$,

$$m = \sum_{d|m} \phi(d) \quad\text{and so}\quad \phi(m) = \sum_{d|m} \mu(d)\,(m/d).$$

Proof. To prove the first formula, fix $m \ge 1$. For each divisor d of m, let

$$S_d = \{x \in \mathbb{N}^+ : 1 \le x \le m \text{ and } \gcd(x,m) = d\}.$$

It is immediate that the m-element set $\{1,2,\ldots,m\}$ is the disjoint union of the sets $S_d$ as
d ranges over the positive divisors of m. Whenever d divides m, we have $\gcd(x,m) = d$ iff
d divides x and $\gcd(x/d, m/d) = 1$. It follows that division by d gives a bijection from the
set $S_d$ onto the set of numbers counted by $\phi(m/d)$. Therefore, $|S_d| = \phi(m/d)$. By the sum
rule,

$$m = \sum_{d|m} |S_d| = \sum_{d|m} \phi(m/d) = \sum_{d|m} \phi(d).$$

The last equality follows by noting that the number m/d ranges over all positive divisors of
m as d ranges over all positive divisors of m. Applying Möbius inversion (with f(m) = m
and $g(m) = \phi(m)$), we obtain the second formula in the theorem. □
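
The identities in 4.33 and 4.34 lend themselves to a quick numerical check. The sketch below computes $\tau$, $\sigma$, $\sigma_2$, and $\phi$ directly from their definitions; the function names and the brute-force style are illustrative assumptions, not an efficient implementation.

```python
from math import gcd

def divisors(m):
    return [d for d in range(1, m + 1) if m % d == 0]

def mobius(m):
    """Classical Möbius function, by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

def tau(m):
    """Number of positive divisors of m."""
    return len(divisors(m))

def sigma(m, k=1):
    """Sum of the k-th powers of the divisors of m (k=1: sigma, k=2: sigma_2)."""
    return sum(d ** k for d in divisors(m))

def phi(m):
    """Euler's function: count of x in 1..m with gcd(x, m) = 1."""
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)

def invert(f, m):
    """Sum of f(m/d) mu(d) over d | m, as in the inversion formula 4.30."""
    return sum(f(m // d) * mobius(d) for d in divisors(m))
```

Here `invert(tau, m)` returns 1, `invert(sigma, m)` returns m, and `invert(lambda t: sigma(t, 2), m)` returns m squared, matching 4.33. Taking `f` to be the identity function, `invert(lambda t: t, m)` equals `phi(m)`, matching 4.34.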


Some applications of these results to field theory are presented in §12.6. 




4.8 Partially Ordered Sets 


We will see that the inclusion-exclusion formula 4.7 and the classical Möbius inversion
formula 4.30 are special cases of the general Möbius inversion formula for partially ordered
sets (posets). First we must review some definitions and examples concerning posets. 

Recall from 2.54 the definition of relations and the notions of reflexive, irreflexive, sym- 
metric, antisymmetric, and transitive relations. Given a relation R on a finite set X, the 
pair (X, R) is a digraph G with vertex set X and directed edge set R. Reflexivity means 
that every vertex of G has a loop edge; irreflexivity means that no vertex of G has a loop 
edge. Symmetry means that the reversal of every edge is also an edge (so we can think of 
G as undirected); antisymmetry means that it is never true that a non-loop edge and its 
reversal are both in G. Finally, transitivity means that whenever there is a walk (x, y, z) of
length 2 in G, the edge (x, z) is also present in G. More generally, we see by induction that 
when R is transitive, there exists a walk from x to z in G of positive length iff the edge 
(x, z) is present in G. 


4.35. Poset Definitions. A partial order relation on X is a relation that is antisymmetric,
transitive, and reflexive on X. A strict order relation on X is a relation that is transitive
and irreflexive on X. A partially ordered set (poset) is a pair $(X, \le)$ where $\le$ is a partial
order relation on X. A totally ordered set is a poset $(X, \le)$ such that for all $x, y \in X$, either
$x \le y$ or $y \le x$.


4.36. Example. Let $X = \{1,2,\ldots,n\}$ and take $\le$ to be the usual ordering of integers.
Then $(X, \le)$ is an n-element totally ordered poset. More generally, for any $S \subseteq \mathbb{R}$, $(S, \le)$ is
a totally ordered poset.


4.37. Example: Boolean Posets. Let S be any set, and let X = P(S) be the set of
all subsets of S. Then $(X, \subseteq)$ is a poset, where $A \subseteq B$ means that A is a subset of B. In
particular, $(P(\{1,2,\ldots,n\}), \subseteq)$ is a poset of size $2^n$. This poset is not totally ordered when
n > 1.


4.38. Example: Divisibility Posets. Consider the divisibility relation | on $\mathbb{N}^+$ defined
by a|b iff b = ac for some $c \in \mathbb{N}^+$. Then $(\mathbb{N}^+, |)$ is an infinite poset. Given a fixed positive
integer n, let X be the set of all divisors of n. Restricting | to X gives a finite poset (X, |).
This poset is a totally ordered set iff n is a prime power.


The next result shows that partial order relations and strict order relations are essentially
equivalent concepts.

4.39. Theorem: Partial Orders vs. Strict Orders. Let X be a set, let P be the set
of all partial order relations on X, and let S be the set of all strict order relations on X.
There is a canonical bijection between P and S.


Proof. Let $\Delta = \{(x,x) : x \in X\}$ be the “diagonal” of $X \times X$. Define $f : P \to S$ by setting
$f(\le) = {\le} \setminus \Delta$ for each partial ordering $\le$ on X. Define $g : S \to P$ by setting $g(<) = {<} \cup \Delta$
for each strict ordering < on X. In terms of the digraphs, f removes self-loops from all
vertices and g restores the self-loops. It is an exercise for the reader to show that f does
map P into S, g does map S into P, and $f \circ g$ and $g \circ f$ are both identity maps. □




4.9 Möbius Inversion for Posets


4.40. Definition: Matrix of a Relation. Let $X = \{x_1, x_2, \ldots, x_n\}$ be a finite set, and
let R be a relation on X. Define the matrix of R to be the $n \times n$ matrix A = A(R) such
that $A_{i,j} = \chi(x_i R x_j)$. A(R) is the adjacency matrix of the digraph (X, R).


4.41. Theorem. Let $\le$ be a partial ordering of $X = \{x_1,\ldots,x_n\}$, and let < be the
associated strict ordering of X (see 4.39). Consider the matrices $Z = A(\le)$ and $N = A(<)$.
Then Z = I + N; N is nilpotent; Z is invertible; and

$$Z^{-1} = I - N + N^2 - N^3 + \cdots + (-1)^{n-1} N^{n-1}. \qquad (4.4)$$

Proof. The matrix identity Z = I + N holds since $(X, \le)$ is obtained from (X, <) by
adding self-loops at each $x \in X$. Next, we claim that the digraph (X, <) is acyclic. For if
$(z_1, z_2, \ldots, z_k, z_1)$ were a directed cycle in this digraph, we must have $z_1 < z_2 < \cdots < z_k < z_1$.
Then transitivity gives $z_1 < z_1$, which contradicts irreflexivity. By 3.24, N is nilpotent.
The statements about the inverse of Z now follow from 3.25, taking A there to be $-N$. □


4.42. Definition: Möbius Function of a Finite Poset. Keeping the notation of the
preceding theorem, define $\mu = \mu_{(X,\le)} : X \times X \to \mathbb{Z}$ by setting $\mu(x_i, x_j)$ to be the i,j-entry
of $Z^{-1}$. The function $\mu$ is called the Möbius function of the poset $(X, \le)$.


4.43. Example. Let X = {1,2,3,4} with the usual ordering. For this poset, we have

$$Z = \begin{pmatrix} 1&1&1&1\\ 0&1&1&1\\ 0&0&1&1\\ 0&0&0&1 \end{pmatrix}, \qquad
N = \begin{pmatrix} 0&1&1&1\\ 0&0&1&1\\ 0&0&0&1\\ 0&0&0&0 \end{pmatrix}.$$

The powers of N are

$$N^2 = \begin{pmatrix} 0&0&1&2\\ 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0 \end{pmatrix}, \qquad
N^3 = \begin{pmatrix} 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0 \end{pmatrix}, \qquad N^4 = 0.$$

So the inverse of Z is

$$Z^{-1} = I - N + N^2 - N^3 = \begin{pmatrix} 1&-1&0&0\\ 0&1&-1&0\\ 0&0&1&-1\\ 0&0&0&1 \end{pmatrix}.$$

So $\mu(i,i) = 1$, $\mu(i,i+1) = -1$, and $\mu(i,j) = 0$ for all $j \ne i, i+1$.

4.44. Example: Möbius Function of a Totally Ordered Poset. The preceding example
generalizes as follows. Let $X = \{1,2,\ldots,n\}$ with the usual ordering. We have $Z_{i,j} = 1$
for $i \le j$ and $Z_{i,j} = 0$ for i > j. For all i, let $M_{i,i} = 1$, $M_{i,i+1} = -1$, and $M_{i,j} = 0$ for
$j \ne i, i+1$. A routine matrix calculation shows that ZM = MZ = I. So for this poset,

$$\mu(i,i) = 1, \qquad \mu(i,i+1) = -1, \qquad \mu(i,j) = 0 \text{ for } j \ne i, i+1.$$
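
Formula (4.4) translates directly into a short computation. The following Python sketch uses plain nested lists for matrices; the names `mat_mul` and `mobius_matrix` are introduced here only for illustration. It builds N from a comparison function and accumulates the alternating sum of its powers.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mobius_matrix(elements, leq):
    """Z^{-1} = I - N + N^2 - ... + (-1)^{n-1} N^{n-1}, as in (4.4)."""
    n = len(elements)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    # N is the matrix of the strict order: x < y means x <= y and x != y.
    N = [[int(x != y and leq(x, y)) for y in elements] for x in elements]
    M = [row[:] for row in identity]
    power, sign = identity, 1
    for _ in range(1, n):
        power = mat_mul(power, N)   # next power of N
        sign = -sign
        for i in range(n):
            for j in range(n):
                M[i][j] += sign * power[i][j]
    return M

# The 4-element chain of Example 4.43.
chain4 = mobius_matrix([1, 2, 3, 4], lambda a, b: a <= b)
```

For the chain {1,2,3,4} the result has 1 on the diagonal, -1 on the superdiagonal, and 0 elsewhere, in agreement with Examples 4.43 and 4.44. The same function accepts any finite poset given by a list of elements and a comparison predicate.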


4.45. Example: Möbius Function for Boolean Posets. Consider the poset $(X, \subseteq)$,
where X consists of all subsets of $[n] = \{1,2,\ldots,n\}$. In this example, we will index the
rows and columns of matrices by subsets of [n]. For $S, T \subseteq [n]$, the S,T-entry of Z is 1
if $S \subseteq T$, and 0 otherwise. We claim that the inverse matrix $M = Z^{-1}$ has S,T-entry
$\mu(S,T) = (-1)^{|T \setminus S|}$ if $S \subseteq T$, and zero otherwise. To verify this, let us show that ZM = I.
The S,T-entry of ZM is

$$\sum_{U \subseteq [n]} Z(S,U) M(U,T) = \sum_{U:\, S \subseteq U \subseteq T} (-1)^{|T \setminus U|}.$$

If S = T, this sum is 1; while if $S \not\subseteq T$, this sum is 0. Now consider the case where $S \subsetneq T$.
Let S have a elements and T have a + b elements, where b > 0. For $0 \le c \le b$, the number
of sets U with $S \subseteq U \subseteq T$ and $|T \setminus U| = c$ is $\binom{b}{b-c} = \binom{b}{c}$. Grouping terms in the sum
based on the size of $T \setminus U$, we see that

$$(ZM)(S,T) = \sum_{c=0}^{b} (-1)^c \binom{b}{c} = (-1+1)^b = 0.$$

So the Möbius function for this poset is

$$\mu(S,T) = (-1)^{|T \setminus S|} \chi(S \subseteq T) \qquad (S, T \subseteq [n]).$$

An alternate proof of this formula will be given in 4.58 below.
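
The verification that ZM = I in Example 4.45 can be replayed by machine for small n. This sketch represents subsets as Python `frozenset` objects; the helper names and the choice n = 3 are assumptions made for illustration.

```python
from itertools import combinations

def subsets(n):
    """All subsets of {1, ..., n} as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for r in range(n + 1) for c in combinations(items, r)]

def claimed_mu(S, T):
    """The claimed Möbius function: (-1)^{|T \\ S|} if S is a subset of T, else 0."""
    return (-1) ** len(T - S) if S <= T else 0

def zm_entry(S, T, all_subsets):
    """The S,T-entry of ZM: sum of M(U, T) over all U with S a subset of U."""
    return sum(claimed_mu(U, T) for U in all_subsets if S <= U)

P3 = subsets(3)
zm_is_identity = all(
    zm_entry(S, T, P3) == int(S == T) for S in P3 for T in P3
)
```

For n = 3 every entry of ZM matches the identity matrix, so `zm_is_identity` is `True`.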


4.46. Example: Möbius Function for Divisibility Posets. Let n be a fixed positive
integer, let X be the set of positive divisors of n, and consider the divisibility poset (X, |).
There is a close relation between the classical Möbius function $\mu$ and the Möbius function
$\mu_X$ for this poset. More precisely, we claim that

$$\mu(d) = \mu_X(1,d) \text{ for all } d \text{ dividing } n.$$

To verify this, let us work with matrices whose rows and columns are indexed by the
positive divisors of n, considered in increasing order. As above, let Z be the matrix such
that $Z_{d,e} = \chi(d|e)$; let M be the inverse matrix, which is uniquely determined by Z; and
let v be the row vector $(\mu(d) : d|n)$. The identity $\sum_{d|m} \mu(d) = \chi(m = 1)$, which is valid for
all m dividing n, can now be rewritten as the vector identity $vZ = (1,0,\ldots,0)$. This shows
that v must be the first row of M. It will be shown in 4.59 that $\mu_X(d,e) = \mu(e/d)$ whenever
d|e and e|n, whereas $\mu_X(d,e) = 0$ if d does not divide e.


The next definition will be used to give a combinatorial interpretation for the values of
the Möbius function.


4.47. Definition: Chains in a Poset. Let $(X, \le)$ be a poset. A chain of length k in X is
a sequence $C = (z_0, z_1, \ldots, z_k)$ of elements of X such that

$$z_0 < z_1 < \cdots < z_k.$$

We say that C is a chain from $z_0$ to $z_k$ and write len(C) = k. The sign of the chain C is
$\mathrm{sgn}(C) = (-1)^k$.

4.48. Theorem: Möbius Functions and Signed Chains. Let $(X, \le)$ be a finite poset.
Given $y, z \in X$, let S be the set of all chains in X from y to z. Then

$$\mu_{(X,\le)}(y,z) = \sum_{C \in S} \mathrm{sgn}(C).$$

In particular, if $y \not\le z$, then $\mu_{(X,\le)}(y,z) = 0$.

Proof. We know from (4.4) that

$$\mu_{(X,\le)}(y,z) = \sum_{k \ge 0} (-1)^k N^k(y,z),$$

where N is the adjacency matrix of the digraph G = (X, <). A chain of length k from y to
z is the same as a walk (or path) of length k from y to z in G. By 3.18, the number of such
walks is $N^k(y,z)$. The theorem now follows from the sum rule. □
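
Theorem 4.48 suggests computing Möbius values by summing signs of chains. The sketch below does this recursively: a chain from a to z with a different from z consists of a first step from a to some b with $a < b \le z$, followed by a chain from b to z, and the extra step flips the sign. The function name `mu_by_chains` and the divisor example are illustrative choices, not taken from the text.

```python
from functools import lru_cache

def mu_by_chains(elements, less, y, z):
    """Sum of sgn(C) over all chains C from y to z in the poset.

    `elements` is a sequence of hashable poset elements and `less(a, b)`
    is the strict order relation.
    """
    @lru_cache(maxsize=None)
    def signed_sum(a):
        if a == z:
            return 1  # the single chain (z) of length 0 has sign +1
        # Choose the second element b of the chain; the rest is a chain b -> z.
        return -sum(
            signed_sum(b)
            for b in elements
            if less(a, b) and (b == z or less(b, z))
        )

    return signed_sum(y)

# Divisibility poset on the divisors of 12 (hypothetical test case).
divisors_of_12 = (1, 2, 3, 4, 6, 12)

def strictly_divides(a, b):
    return a != b and b % a == 0
```

For instance, `mu_by_chains(divisors_of_12, strictly_divides, 1, 6)` returns 1 and the same call with target 12 returns 0, consistent with the classical values of $\mu(6)$ and $\mu(12)$ and with the relation stated in Example 4.46.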


4.49. Theorem: Möbius Inversion Formula on Posets. Let $(X, \le)$ be a finite poset
with Möbius function $\mu$. Suppose R is a commutative ring and $f, g : X \to R$ are two
functions. Then

$$\forall x \in X,\ g(x) = \sum_{y \le x} f(y) \quad\text{iff}\quad \forall x \in X,\ f(x) = \sum_{y \le x} g(y)\mu(y,x).$$

Proof. Let $X = \{x_1,\ldots,x_n\}$, and define $Z = A(\le)$ and $M = Z^{-1}$ as in 4.41. Also define
row vectors $F = [f(x_1),\ldots,f(x_n)]$ and $G = [g(x_1),\ldots,g(x_n)]$. The left-hand formula in
the theorem is equivalent to the matrix identity G = FZ, since $G_j = g(x_j)$ and

$$(FZ)_j = \sum_{k=1}^{n} F_k Z_{k,j} = \sum_{k=1}^{n} f(x_k)\chi(x_k \le x_j) = \sum_{y \le x_j} f(y).$$

Similarly, keeping in mind that $\mu(y,x) = 0$ unless $y \le x$, the right-hand formula in the
theorem is equivalent to the matrix identity F = GM. Since M and Z are inverse matrices,
G = FZ is equivalent to GM = F. □


4.50. Example. In the special case where $X = \{1,2,\ldots,n\}$ with the usual ordering,
4.49 reduces to the following statement: given $f_1,\ldots,f_n \in R$ and $g_1,\ldots,g_n \in R$, we have
($g_i = f_1 + f_2 + \cdots + f_i$ for all i) iff ($f_1 = g_1$ and $f_i = g_i - g_{i-1}$ for $1 < i \le n$).
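
Example 4.50 says that, on a chain, Möbius inversion is the familiar passage between a sequence and its partial sums. A minimal Python sketch, with function names chosen here for illustration and nonempty input lists assumed:

```python
from itertools import accumulate

def partial_sums(f):
    """g_i = f_1 + ... + f_i (the 'zeta' direction on a chain)."""
    return list(accumulate(f))

def differences(g):
    """f_1 = g_1 and f_i = g_i - g_{i-1} (the 'Möbius' direction).

    Assumes g is a nonempty list.
    """
    return [g[0]] + [g[i] - g[i - 1] for i in range(1, len(g))]

sample = [3, -1, 4, 1, 5]
round_trip = differences(partial_sums(sample))
```

Here `round_trip` equals `sample`, and applying the two maps in the opposite order also returns the original list.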


4.51. Example. In the special case where X is the set of divisors of n ordered by divisibility,
4.49 reduces to the classical inversion formula 4.30, using the fact that $\mu_X(d,e) = \mu(e/d)$
when d|e, and $\mu_X(d,e) = 0$ otherwise.


4.52. Example. In the special case where X = P([n]) ordered by containment of subsets,
4.49 reduces to the following statement:

$$\forall T \subseteq [n],\ g(T) = \sum_{S \subseteq T} f(S) \quad\text{iff}\quad \forall T \subseteq [n],\ f(T) = \sum_{S \subseteq T} (-1)^{|T \setminus S|} g(S).$$

If instead we use the “opposite” poset $(X, \supseteq)$, one obtains:

$$\forall T \subseteq [n],\ g(T) = \sum_{S \supseteq T} f(S) \quad\text{iff}\quad \forall T \subseteq [n],\ f(T) = \sum_{S \supseteq T} (-1)^{|S \setminus T|} g(S).$$

We now use this result to rederive a version of the original inclusion-exclusion formula.
Let $Z_1,\ldots,Z_n$ be given subsets of a finite set Z. For $S \subseteq [n]$, let f(S) be the number of
objects $z \in Z$ such that $z \in Z_i$ if and only if $i \in S$. For $S \subseteq [n]$, let g(S) be the number
of objects $z \in Z$ such that $z \in Z_i$ if $i \in S$. Regarding $Z_i$ as the set of objects in Z with a
certain property i, we can say that f(S) counts objects that have exactly the properties in
S, whereas g(S) counts the objects that have at least the properties in S. It follows from
this that $g(T) = \sum_{S \supseteq T} f(S)$ for all T, so 4.49 tells us that $f(T) = \sum_{S \supseteq T} (-1)^{|S \setminus T|} g(S)$ for
all T. Now, $f(\emptyset) = |Z \setminus (Z_1 \cup \cdots \cup Z_n)|$ and $g(\{i_1,\ldots,i_k\}) = |Z_{i_1} \cap \cdots \cap Z_{i_k}|$. The formula
in 4.11 follows from these observations.
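
The "exactly" versus "at least" bookkeeping above can be illustrated on a concrete example. In this sketch the universe Z and the three properties (divisibility by 2, 3, and 5) are hypothetical choices made for illustration; f and g are computed directly from their definitions, and f is then recomputed from g by the inversion formula over supersets.

```python
from itertools import combinations

Z = range(1, 31)
# Property i holds for z when z lies in props[i] (illustrative choice).
props = {
    1: {z for z in Z if z % 2 == 0},
    2: {z for z in Z if z % 3 == 0},
    3: {z for z in Z if z % 5 == 0},
}

def f(S):
    """Objects having exactly the properties in S."""
    return sum(1 for z in Z if {i for i in props if z in props[i]} == set(S))

def g(S):
    """Objects having at least the properties in S."""
    return sum(1 for z in Z if all(z in props[i] for i in S))

def supersets(T):
    """All subsets S of the index set with S containing T."""
    rest = sorted(set(props) - set(T))
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            yield set(T) | set(extra)

def f_by_inversion(T):
    """Sum of (-1)^{|S \\ T|} g(S) over all S containing T."""
    return sum((-1) ** len(S - set(T)) * g(S) for S in supersets(T))
```

With these choices, `f_by_inversion(set())` counts the integers from 1 to 30 divisible by none of 2, 3, 5, namely 30 - 15 - 10 - 6 + 5 + 3 + 2 - 1 = 8, which agrees with `f(set())`.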


4.10 Product Posets 


This section introduces a construction on posets that leads to alternative derivations of the
Möbius functions for the posets $(P([n]), \subseteq)$ and $(\{d : d|n\}, |)$.

4.53. Definition: Product Posets. Let $(X_1, \le_1), \ldots, (X_n, \le_n)$ be posets. Consider the
Cartesian product $X = X_1 \times \cdots \times X_n$, which consists of all n-tuples $x = (x_1,\ldots,x_n)$ with
$x_i \in X_i$. For $x = (x_i)$ and $y = (y_i)$ in X, define $x \le y$ iff $x_i \le_i y_i$ for $1 \le i \le n$. One
immediately verifies that $\le$ is a partial ordering on X. The poset $(X, \le)$ is called the product
of the posets $(X_i, \le_i)$.

4.54. Example. Let $X_1 = X_2 = \{1,2\}$ with the usual ordering. Both $X_1$ and $X_2$ are
totally ordered posets, but $X = X_1 \times X_2$ is not totally ordered. For example, (1,2) and
(2,1) are two incomparable elements of X.

4.55. Theorem: Möbius Function for a Product Poset. Let $(X, \le)$ be the product
of posets $(X_i, \le_i)$ for $1 \le i \le k$. Given $x = (x_i)$ and $y = (y_i)$ in X, we have

$$\mu_{(X,\le)}(x,y) = \prod_{i=1}^{k} \mu_{(X_i,\le_i)}(x_i, y_i).$$

Proof. For brevity, write $\mu = \mu_{(X,\le)}$ and $\mu_i = \mu_{(X_i,\le_i)}$. By induction, we can reduce to the
case k = 2. We have the matrices

$$Z_1 = [\chi(u_1 \le_1 v_1) : u_1, v_1 \in X_1], \qquad M_1 = [\mu_1(u_1,v_1) : u_1, v_1 \in X_1],$$
$$Z_2 = [\chi(u_2 \le_2 v_2) : u_2, v_2 \in X_2], \qquad M_2 = [\mu_2(u_2,v_2) : u_2, v_2 \in X_2],$$
$$Z = [\chi(u \le v) : u, v \in X], \qquad M = [\mu(u,v) : u, v \in X],$$

which satisfy $Z_1 M_1 = I$, $Z_2 M_2 = I$, and ZM = I. Define a matrix M′, with rows and
columns indexed by elements of X, such that for $u = (u_1,u_2)$ and $v = (v_1,v_2)$ in X,
the u,v-entry of M′ is $\mu_1(u_1,v_1)\mu_2(u_2,v_2)$. Note that the u,v-entry of Z is $\chi((u_1,u_2) \le (v_1,v_2)) = \chi(u_1 \le_1 v_1)\chi(u_2 \le_2 v_2)$.
The following computation verifies that ZM′ = I, and hence M′ = M:

$$\begin{aligned}
(ZM')(u,w) &= \sum_{v \in X} Z(u,v) M'(v,w) \\
&= \sum_{v_1 \in X_1} \sum_{v_2 \in X_2} \chi(u_1 \le_1 v_1)\chi(u_2 \le_2 v_2)\,\mu_1(v_1,w_1)\mu_2(v_2,w_2) \\
&= \Bigl(\sum_{v_1 \in X_1} \chi(u_1 \le_1 v_1)\mu_1(v_1,w_1)\Bigr) \cdot \Bigl(\sum_{v_2 \in X_2} \chi(u_2 \le_2 v_2)\mu_2(v_2,w_2)\Bigr) \\
&= \chi(u_1 = w_1)\chi(u_2 = w_2) = \chi(u = w). \qquad \Box
\end{aligned}$$

4.56. Definition: Poset Isomorphisms. Given posets $(X, \le)$ and $(X', \le')$, a poset
isomorphism is a bijection $f : X \to X'$ such that

$$u \le v \text{ iff } f(u) \le' f(v) \qquad (u, v \in X).$$

4.57. Theorem. If $f : (X, \le) \to (X', \le')$ is a poset isomorphism, then

$$\mu_{(X',\le')}(f(u), f(v)) = \mu_{(X,\le)}(u,v) \qquad (u, v \in X).$$

Proof. This follows, for instance, from 4.48. For, the chains of a given length from u to v in
$(X, \le)$ correspond bijectively to the chains of that length from f(u) to f(v) in $(X', \le')$; we
merely apply f to each element in the chain. □
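
The product formula of Theorem 4.55 can be tested on a small product of two chains. In the sketch below, `mobius_function` computes the Möbius function of any finite poset from the relation MZ = I, which gives $\mu(x,x) = 1$ and, for x < z, $\sum_{x \le y \le z} \mu(x,y) = 0$. The function name and the two chains used are illustrative assumptions.

```python
from itertools import product

def mobius_function(elements, leq):
    """Möbius function of a finite poset, as a dict keyed by pairs (x, z)."""
    mu = {}

    def value(x, z):
        if (x, z) not in mu:
            if x == z:
                mu[(x, z)] = 1
            elif not leq(x, z):
                mu[(x, z)] = 0
            else:
                # From MZ = I: mu(x, z) = -(sum of mu(x, y) over x <= y < z).
                mu[(x, z)] = -sum(
                    value(x, y)
                    for y in elements
                    if y != z and leq(x, y) and leq(y, z)
                )
        return mu[(x, z)]

    for x in elements:
        for z in elements:
            value(x, z)
    return mu

chain_a = [0, 1, 2]                    # usual order on integers
chain_b = [0, 1]
pairs = list(product(chain_a, chain_b))

def leq_int(a, b):
    return a <= b

def leq_pair(u, v):
    """Componentwise order on the product poset."""
    return u[0] <= v[0] and u[1] <= v[1]

mu_a = mobius_function(chain_a, leq_int)
mu_b = mobius_function(chain_b, leq_int)
mu_prod = mobius_function(pairs, leq_pair)

product_formula_holds = all(
    mu_prod[(u, v)] == mu_a[(u[0], v[0])] * mu_b[(u[1], v[1])]
    for u in pairs for v in pairs
)
```

On this example `product_formula_holds` is `True`: the Möbius function computed directly on the six-element product agrees entry by entry with the product of the Möbius functions of the two chains.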


4.58. Example: Möbius Function of a Boolean Poset. Consider again the poset
$X = (P([n]), \subseteq)$. For $1 \le i \le n$, take $Y_i = \{0,1\}$ with the usual ordering, and let
$Y = Y_1 \times \cdots \times Y_n$ be the product poset. There is a bijection f from P([n]) to $\{0,1\}^n$ that sends
a subset S to the word $f(S) = w = w_1 w_2 \cdots w_n$ with $w_i = 1$ for $i \in S$ and $w_i = 0$ for $i \notin S$.
One readily sees that f is a poset isomorphism, so $\mu_X(S,T) = \mu_Y(f(S), f(T))$. Writing
$f(T) = z = z_1 z_2 \cdots z_n$, 4.55 shows that $\mu_Y(w,z) = \prod_{i=1}^{n} \mu_{Y_i}(w_i, z_i)$. As in 4.44, we see that

$$\mu_{Y_i}(0,0) = \mu_{Y_i}(1,1) = 1; \qquad \mu_{Y_i}(0,1) = -1; \qquad \mu_{Y_i}(1,0) = 0.$$

So $\mu_Y(w,z) = 0$ unless $w \le z$. If $w \le z$ and z has k more ones than w does, we see that
$\mu_Y(w,z) = (-1)^k$. Translating back to subsets via $f^{-1}$, this says that $\mu_X(S,T) = 0$ when
$S \not\subseteq T$, and $\mu_X(S,T) = (-1)^{|T \setminus S|}$ when $S \subseteq T$.


4.59. Example: Möbius Function of a Divisibility Poset. Let n be a fixed positive
integer with prime factorization $n = p_1^{n_1} \cdots p_k^{n_k}$, and consider the divisibility poset (X, |),
where $X = \{d : d|n\}$. For $1 \le i \le k$, let $Y_i = \{0,1,\ldots,n_i\}$ with the usual ordering, and take
Y to be the product poset $Y_1 \times \cdots \times Y_k$. Any $d \in X$ has prime factorization $d = p_1^{d_1} \cdots p_k^{d_k}$
for some $d_i \le n_i$. The map $d \mapsto (d_1,\ldots,d_k)$ is readily seen to be a poset isomorphism from
X to Y. So

$$\mu_X(d,e) = \mu_Y((d_1,\ldots,d_k), (e_1,\ldots,e_k)) = \prod_{i=1}^{k} \mu_{Y_i}(d_i, e_i).$$

As in 4.44, we see that $\mu_{Y_i}(d_i,e_i) = \chi(e_i = d_i) - \chi(e_i = d_i + 1)$. It follows that $\mu_X(d,e) = 0$
unless e is obtained from d by multiplying by a set of s distinct prime factors chosen from
$\{p_1,\ldots,p_k\}$, in which case $\mu_X(d,e) = (-1)^s$. It is now routine to check that whenever d|e,
$\mu_X(d,e) = \mu(e/d)$, where $\mu$ is the number-theoretic Möbius function.
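
The final "routine check" in Example 4.59 can be automated for a specific n. The sketch below encodes each divisor by its exponent vector, multiplies the chain Möbius values $\chi(e_i = d_i) - \chi(e_i = d_i + 1)$, and compares the result with the classical $\mu(e/d)$. The choice n = 360 and all helper names are assumptions made for illustration.

```python
def mobius(m):
    """Classical Möbius function, by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

def exponent_vector(d, primes):
    """Exponents of the given primes in the factorization of d."""
    vec = []
    for p in primes:
        k = 0
        while d % p == 0:
            d //= p
            k += 1
        vec.append(k)
    return vec

def mu_chain(a, b):
    """Möbius function of a chain of integers: chi(b = a) - chi(b = a + 1)."""
    return int(b == a) - int(b == a + 1)

def mu_divisibility(d, e, primes):
    """Product formula of 4.59 applied to the exponent vectors of d and e."""
    result = 1
    for a, b in zip(exponent_vector(d, primes), exponent_vector(e, primes)):
        result *= mu_chain(a, b)
    return result

n = 360                      # 2^3 * 3^2 * 5 (illustrative choice)
primes_of_n = [2, 3, 5]
divs = [d for d in range(1, n + 1) if n % d == 0]
```

For every pair of divisors d, e of 360, `mu_divisibility(d, e, primes_of_n)` equals `mobius(e // d)` when d divides e, and equals 0 otherwise.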


Summary 


• Involutions. An involution is a function $I : X \to X$ with $I \circ I = \mathrm{id}_X$. The fixed point set
of I is $\mathrm{Fix}(I) = \{x \in X : I(x) = x\}$. If X consists of signed objects, I is sign-reversing
iff $\mathrm{sgn}(I(x)) = -\mathrm{sgn}(x)$ for all $x \in X \setminus \mathrm{Fix}(I)$. For a sign-reversing involution I with
domain X,

$$\sum_{x \in X} \mathrm{sgn}(x) = \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x).$$

Involutions provide combinatorial proofs of identities that involve signed terms.

• Inclusion-Exclusion Formulas. For arbitrary finite sets $S_1,\ldots,S_n$,

$$|S_1 \cup S_2 \cup \cdots \cup S_n| = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|.$$

If each $S_i$ is a subset of a finite set X, then

$$|X \setminus (S_1 \cup \cdots \cup S_n)| = \sum_{I \subseteq \{1,2,\ldots,n\}} (-1)^{|I|} \Bigl|\bigcap_{i \in I} S_i\Bigr|,$$

where the summand indexed by $I = \emptyset$ is interpreted as |X|. In the special case where
$|\bigcap_{i \in I} S_i| = N(k)$ for all k-element subsets I, the formula simplifies to

$$|X \setminus (S_1 \cup \cdots \cup S_n)| = |X| + \sum_{k=1}^{n} (-1)^k \binom{n}{k} N(k).$$
k=1 


• Surjections and Stirling Numbers. For $m \ge n \ge 1$, there are $\sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^m$
surjections from an m-element set onto an n-element set. A summation formula for the
Stirling number of the second kind is

$$S(m,n) = \sum_{k=0}^{n} (-1)^k \frac{(n-k)^m}{k!\,(n-k)!}.$$


• Euler's $\phi$ Function. For $m \ge 1$, $\phi(m)$ is the number of positive integers $x \le m$ with
$\gcd(x,m) = 1$. We have $\phi(m) = m \prod_{p|m} (1 - p^{-1})$, where the product ranges over all
prime divisors p of m. For $m = q^e$ with q prime, $\phi(q^e) = q^e - q^{e-1}$. If $\gcd(m,n) = 1$,
then $\phi(mn) = \phi(m)\phi(n)$. For $m \ge 1$, $\sum_{d|m} \phi(d) = m$.


• Derangements. A derangement of S is a bijection $f : S \to S$ with $f(x) \ne x$ for all $x \in S$.
Let $d_n$ be the number of derangements of an n-element set. Then $d_n = n! \sum_{k=0}^{n} (-1)^k / k!$
is the closest integer to n!/e. Moreover, the numbers $d_n$ satisfy the recursions

$$d_n = (n-1)d_{n-1} + (n-1)d_{n-2} \qquad (n \ge 2);$$

$$d_n = n\,d_{n-1} + (-1)^n \qquad (n \ge 1).$$


• Coefficients of Chromatic Polynomials. For a simple graph G, the coefficient of $x^c$ in
the chromatic polynomial $\chi_G(x)$ is $\sum_{e \ge 0} (-1)^e n(e,c)$, where n(e,c) is the number of
subgraphs H of G such that V(H) = V(G), |E(H)| = e, and H has c connected
components.


• Number-theoretic Möbius Function. Define $\mu : \mathbb{N}^+ \to \{-1,0,1\}$ by $\mu(n) = (-1)^s$ if n
is the product of $s \ge 0$ distinct primes, and $\mu(n) = 0$ otherwise. Then $\sum_{d|m} \mu(d) =
\chi(m = 1)$. Given functions f and g such that $f(m) = \sum_{d|m} g(d)$ for $m \ge 1$, the classical
Möbius inversion formula states that

$$g(m) = \sum_{d|m} f(m/d)\mu(d) = \sum_{d|m} f(d)\mu(m/d) \qquad (m \ge 1).$$

It follows that $\phi(m) = \sum_{d|m} \mu(d)\, m/d$.

• Posets. A partial ordering of X is a relation $\le$ on X that is reflexive, antisymmetric,
and transitive; the pair $(X, \le)$ is called a poset. A strict ordering of X is a relation <
on X that is irreflexive and transitive. There is a bijection between partial orders on X
and strict orders on X defined by removing the diagonal $\{(x,x) : x \in X\}$. A chain of
length k in a poset $(X, \le)$ is a sequence $(z_0, z_1, \ldots, z_k)$ with $z_0 < z_1 < \cdots < z_k$. Such a
chain goes from $z_0$ to $z_k$ and has sign $(-1)^k$.


• Möbius Functions for Posets. Given a poset $(X = \{x_1,\ldots,x_n\}, \le)$, define $n \times n$ matrices
Z, N, and M by $Z_{ij} = \chi(x_i \le x_j)$, $N_{ij} = \chi(x_i < x_j)$, and $M_{ij}$ = the signed sum of all
chains in the poset from $x_i$ to $x_j$. Then Z = I + N; N is nilpotent; and M is the matrix
inverse of Z. We write $\mu_X(x_i,x_j) = M_{ij}$ and call $\mu_X$ the Möbius function of the poset
$(X, \le)$. Suppose f and g are functions with domain X. The Möbius inversion formula
for posets states that

$$g(x) = \sum_{y \le x} f(y) \text{ for all } x \in X \quad\text{iff}\quad f(x) = \sum_{y \le x} g(y)\mu_X(y,x) \text{ for all } x \in X.$$


• Product Posets. Given posets $(X_i, \le_i)$ for $1 \le i \le n$, the product set $X = X_1 \times \cdots \times X_n$
becomes a poset by defining $(x_1,\ldots,x_n) \le (y_1,\ldots,y_n)$ iff $x_i \le_i y_i$ for all i. The Möbius
function for the product poset satisfies

$$\mu_X((x_1,\ldots,x_n), (y_1,\ldots,y_n)) = \prod_{i=1}^{n} \mu_{X_i}(x_i, y_i).$$


• Examples of Möbius Functions. The poset $X = \{1,2,\ldots,n\}$ with the usual total ordering
has Möbius function

$$\mu_X(i,i) = 1, \qquad \mu_X(i,i+1) = -1, \qquad \mu_X(i,j) = 0 \text{ for } j \ne i, i+1.$$

The Boolean poset $(P([n]), \subseteq)$ of subsets of $\{1,2,\ldots,n\}$ ordered by inclusion has Möbius
function

$$\mu(S,T) = (-1)^{|T \setminus S|} \chi(S \subseteq T) \qquad (S, T \subseteq [n]).$$

If N has prime factorization $p_1^{n_1} \cdots p_k^{n_k}$, then the poset of divisors of N under the
divisibility ordering has Möbius function

$$\mu(d,e) = \begin{cases} (-1)^s & \text{if } e/d \text{ is a product of } s \text{ distinct primes;} \\ 0 & \text{otherwise.} \end{cases}$$

These results follow since the Boolean poset is isomorphic to the product of n copies of
the totally ordered set {0,1}, whereas the divisibility poset is isomorphic to the product
poset $\{0,1,\ldots,n_1\} \times \cdots \times \{0,1,\ldots,n_k\}$.


Exercises 


4.60. Given that |S| = 15, |T| = 13, |U| = 12, $|S \cap T| = 6$, $|S \cap U| = 3$, $|T \cap U| = 4$, and
$|S \cap T \cap U| = 1$, find: (a) $|S \cup T|$; (b) $|S \cup T \cup U|$; (c) the number of objects in exactly one
of the sets S, T, U.

4.61. Given that S, T, U are subsets of X with |X| = 35, |S| = 12, |T| = 14, |U| = 15,
$|S \cap T| = 5 = |S \cap U|$, $|T \cap U| = 6$, and $|(S \cup T) \cap U| = 9$, find: (a) $|S \cap T \cap U|$; (b)
$|X \setminus (S \cup T \cup U)|$; (c) the number of objects in exactly two of the sets S, T, U.


4.62. List all the derangements in $D_4$.


4.63. Compute $d_{10}$ in four ways: (a) by rounding 10!/e to the nearest integer; (b) by using
the summation formula 4.23; (c) by using the recursion in 4.24; (d) by using the recursion
in 4.25.


4.64. Compute $\phi(n)$, $\mu(n)$, $\tau(n)$, and $\sigma(n)$ for the following choices of n: (a) 6; (b) 11; (c)
28; (d) 60; (e) 1001; (f) 121.


4.65. Verify 4.34 by direct calculation for (a) m = 24; (b) m = 30. 


4.66. Given n married couples, how many ways can the n men and n women be paired up 
so that no pair consists of a man and his wife? 


4.67. How many five-card poker hands have at least one card of every suit? 


4.68. How many five-card poker hands have at least one face card, at least one diamond, 
and do not contain both a 2 and a 3? 


4.69. How many ten-digit numbers contain at least one 4, one 5, and one 7? 


4.70. How many bridge hands are void in clubs and have at least one card of value p for 
each prime p < 10? 


4.71. How many surjections $f : \{1,2,\ldots,m\} \to \{1,2,\ldots,n\}$ have the property that f(x) =
1 for exactly one x < m?


4.72. (a) What is the chromatic polynomial for the 4-cycle $C_4$? (b) For each coefficient
of this chromatic polynomial, draw the vertex-spanning subgraphs of $C_4$ counted by that
coefficient.


4.73. For even n > 2, determine the number of integers x < n with gcd(x, n) = 2.


4.74. For $k \ge 0$ and $m \ge 1$, let $\sigma_k(m) = \sum_{d|m} d^k$. (a) Find a formula for $\sigma_k(m)$ in terms
of the prime factorization of m. (b) Find a formula for $m^k$ involving $\sigma_k$ and $\mu$.

4.75. Use 4.20 to show that $\phi(mn) = \phi(m)\phi(n)$ iff gcd(m, n) = 1.


4.76. Explicitly compute how the first involution discussed in 4.6 matches up the 24 objects 
counted by Wy s(4,k) into pairs of objects of opposite sign. 


4.77. Suppose w has cycles (1), (2), (3,8,7), (5,6,9), (4), and

$$U = \{\{(1)\}, \{(2)\}, \{(4), (5,6,9)\}, \{(3,8,7)\}\}.$$

Compute I(w,U), where I is the involution defined at the end of 4.6.


4.78. Consider the derangement $w = 436215 \in D_6$. Find the six derangements in $D_7$ and
the seven derangements in $D_8$ that can be built from w by the construction in the proof
of 4.24.


4.79. Use the recursion for derangements in 4.25 to give a proof by induction of the
summation formula for derangements in 4.23.

4.80. Give the details of the proof of 4.39.


4.81. Show that if G is a simple graph with c connected components, then the chromatic
polynomial $\chi_G(x)$ must be divisible by $x^c$.


4.82. (a) Give an algebraic proof that $\sum_{k=0}^{n} \binom{n}{k} 2^k (-1)^{n-k} = 1$ for n > 0. (b) Prove the
identity in (a) using an involution.


4.83. For integers a > b > 0, evaluate $\sum_{k=0}^{n} \binom{n}{k} a^{n-k} (-b)^k$ by using an involution.


4.84. For integers 0 <a <b <c, evaluate owes by using an involution. 


4.85. Let $S \subseteq T$ be given finite sets. (a) Use an involution to prove $\sum_{U:\, S \subseteq U \subseteq T} (-1)^{|T \setminus U|} =
\chi(S = T)$ (cf. 4.45). (b) In a similar manner, evaluate $\sum_{U:\, S \subseteq U \subseteq T} (-1)^{|U \setminus S|}$.


4.86. Let $d, e \in \mathbb{N}^+$ with d|e. Use an involution to prove $\sum_{k:\, d|k|e} \mu(e/k) = \chi(d = e)$.
Interpret this result in terms of the Möbius function of a poset.


4.87. Count the $n \times n$ matrices A with entries in {0, 1, 2} such that: (a) no row of A contains
all zeroes; (b) every column of A contains at least one zero; (c) there is no index j with
A(i,j) > 0 and A(j,i) > 0 for all i.


4.88. An arrowless vertex in a simple digraph D is a vertex with indegree and outdegree 
zero. How many simple digraphs with vertex set {1,2,...,n} have no arrowless vertices? 


4.89. An isolated vertex in a simple digraph D is a vertex v such that there is no edge
(u,v) or (v,u) in D with $u \ne v$. How many simple digraphs with vertex set $\{1,2,\ldots,n\}$
have no isolated vertices?


4.90. How many simple graphs with vertex set {1,2,...,n} have no isolated vertices? 
4.91. Use 4.11 to compute the chromatic polynomial of the paw graph (see 3.124). 


4.92. (a) How many anagrams in $R(1^3 2^3 \cdots n^3)$ never have three equal letters in a row?
(b) How many anagrams in $R(1^k 2^k \cdots n^k)$ never have k equal letters in a row?


4.93. (a) Count the permutations w of $\{1,2,\ldots,n\}$ such that $w_{i+1} \ne w_i + 1$ for all i < n.
(b) Express your answer to (a) in terms of the derangement numbers $d_k$.


4.94. Given sequences $0 < a_1 < a_2 < \cdots < a_k < A$ and $0 < b_1 < b_2 < \cdots < b_k < B$, use
inclusion-exclusion to derive a formula for the number of lattice paths from (0,0) to (A, B)
that avoid all of the points $(a_i, b_i)$ for $1 \le i \le k$.


4.95. Recursion for Möbius functions. (a) Show that the Möbius function of a poset
$(X, \le)$ can be computed recursively via $\mu(x,z) = -\sum_{y:\, x \le y < z} \mu(x,y)$ for x < z, with initial
conditions $\mu(x,x) = 1$ and $\mu(x,z) = 0$ whenever $x \not\le z$. (b) Show that the Möbius function
also satisfies the recursion $\mu(x,z) = -\sum_{y:\, x < y \le z} \mu(y,z)$ for x < z.


4.96. Poset Associated to a DAG. Suppose G = (X, R) is a DAG. Prove that there 
exists a unique smallest irreflexive, transitive relation < that contains R. The corresponding 
poset (X,<) is called the poset associated to the DAG G. 


4.97. Let (X,<) be the poset associated to the DAG 
({a, b, Cc, d, ets {(a, b), (0, e) (a, ls Cc, By (a, d), (d, e)}). 


Compute the Möbius function $\mu_X$ in two ways, by: (a) inverting the matrix Z; (b) enumerating
signed chains in $(X, \le)$.

4.98. Let $(X, \le)$ be the poset associated to the DAG


({a,b, c,d, e, f}, {(a, 6), (a, 6), (6,4), (0, €); (c,d), (df), (es f)}). 


Compute the Möbius function $\mu_X$ in two ways, by: (a) inverting the matrix Z; (b) enumerating
signed chains in $(X, \le)$.

4.99. A subposet of a poset $(X, \le)$ is a poset $(Y, \le')$, where Y is a subset of X, and for
$a, b \in Y$, $a \le' b$ iff $a \le b$. An interval in X is a subposet of the form $[x,z] = \{y \in X : x \le
y \le z\}$. Show that for all $a, b, c, d \in X$, if the intervals [a,b] and [c,d] are isomorphic posets,
then $\mu_X(a,b) = \mu_X(c,d)$.

4.100. Assume that $X_1$ and $X_2$ are finite disjoint sets. The disjoint union of the posets
$(X_1, \le_1)$ and $(X_2, \le_2)$ is $(X, \le)$ where $X = X_1 \cup X_2$ and for $a, b \in X$, $a \le b$ iff $a, b \in X_1$
and $a \le_1 b$, or $a, b \in X_2$ and $a \le_2 b$. Determine $\mu_X$ in terms of $\mu_{X_1}$ and $\mu_{X_2}$.


4.101. Given a poset $(X, \le)$, define a new poset $(Y, \le')$ by setting $Y = X \cup \{0\}$ (where 0 is
a new symbol not in X), and letting $\le'$ be the extension of $\le$ such that $0 \le' y$ for all $y \in Y$.
Informally, $(Y, \le')$ is obtained from $(X, \le)$ by adjoining a new least element. Determine $\mu_Y$
in terms of $\mu_X$.


4.102. Given posets $(X_1, \le_1)$ and $(X_2, \le_2)$ where $X_1$ and $X_2$ are finite disjoint sets, define
a new poset $(X, \le)$ by setting $X = X_1 \cup X_2$ and, for $a, b \in X$, $a \le b$ iff $a, b \in X_i$ and $a \le_i b$
(i = 1, 2), or $a \in X_1$ and $b \in X_2$. Informally, $(X, \le)$ is obtained from $X_1$ and $X_2$ by making
everything in $X_1$ less than everything in $X_2$. Determine $\mu_X$ in terms of $\mu_{X_1}$ and $\mu_{X_2}$.


4.103. Let S1,...,S, be any events in a sample space X with probability measure P. State 
and prove an analogue of 4.7 that can be used to compute P(S; U---US)). 


4.104. Let $S_1,\ldots,S_n$ be independent events in a sample space X (see 1.84). Prove that
for $1 \leq i \leq n$, the events $S_1, S_2, \ldots, X \setminus S_i, \ldots, S_n$ are independent.


4.105. Let $S_1,\ldots,S_n$ be independent events in a sample space X, with $P(S_i) = p_i$ for each
i. Find the probability that none of the events $S_i$ occurs: (a) using inclusion-exclusion and
the generalized distributive law; (b) using 4.104.


4.106. Use an involution to prove that for all i,n € N, per eo ) CS) = x(i = 0). 


4.107. Use an involution to prove that for $0 \leq k \leq n$, $\sum_{j=k}^{n} (-1)^{j-k}\binom{n}{j}\binom{j}{k}2^{n-j} = \binom{n}{k}$.

4.108. Prove that for all n,j > 0, n? = 7 o(—1)F-F RIS, k) Cae, 


4.109. For $n \geq 0$, evaluate $\sum_{k=0}^{n} (-1)^k \binom{n}{k}(n-k)^n$.


4.110. Use an involution to prove the following identity satisfied by Catalan numbers: 
i n+1-k 
Cr = Micke (nt1/2(-)* "Cn cjel a i 


4.111. Let A be an n x n matrix with A(i,j) = Ga for 1 < i,j <n. (a) Look at small 
examples to guess a formula for $A^{-1}(i,j)$. (b) Prove your guess using an involution.


4.112. How many bijections $f : \{1,2,\ldots,n\} \to \{1,2,\ldots,n\}$ are such that the functional
digraph of f contains no cycle of length k?


4.113. How many anagrams in R(a°b%c3d?) never have two consecutive equal letters? 


4.114. Prove or disprove: for every integer y > 1, there exist only finitely many integers 
x > 1 with d(x) = y.

4.115. How many compositions of n have k parts each of size at most m?


4.116. Call a function $f : X \to Y$ doubly surjective iff for all $y \in Y$, there exist at least two
$x \in X$ with f(x) = y. Count the number of doubly surjective functions from an m-element
set to an n-element set, where $m \geq 2n$. What is the answer when m = 11 and n = 4?


4.117. (a) Let $S_1,\ldots,S_n$ be subsets of a finite set X. Prove that the number of elements
of X that lie in exactly k of the sets $S_j$ is

$$\sum_{i=0}^{n-k} (-1)^i \binom{k+i}{k} \sum_{1 \leq j_1 < j_2 < \cdots < j_{k+i} \leq n} |S_{j_1} \cap S_{j_2} \cap \cdots \cap S_{j_{k+i}}|.$$

(b) Find and prove a similar formula for the number of elements of X that lie in at least k
of the sets $S_j$.


4.118. For $0 \leq k \leq n$, let $d_{n,k}$ be the number of permutations of n objects that have exactly
k fixed points. (a) Use 4.117 to find a formula for $d_{n,k}$. (b) Give algebraic and combinatorial
proofs that $d_{n,k} = \binom{n}{k} d_{n-k,0}$.


4.119. How many integers between 1 and 2311 are divisible by exactly two of the primes 
in {2,3,5,7}? (Use 4.117.) 


4.120. Let $(F_n)$ be the Fibonacci sequence ($F_0 = 0$, $F_1 = 1$, $F_n = F_{n-1} + F_{n-2}$ for $n \geq 2$).
Find a formula for $\sum_{k=0}^{n} (-1)^k F_k$ and prove it, either algebraically or using an involution.


4.121. Find and prove a formula for $\sum_{k=0}^{n} (-1)^k F_k F_{n-k}$.

4.122. For each integer $x \geq 1$, evaluate $\sum_{k=1}^{x} \mu(k)\lfloor x/k \rfloor$.

4.123. For n > 0, evaluate $\sum_{k=1}^{n} (-1)^k\,\mathrm{Surj}(n,k)$.

4.124. For n > 0, evaluate $\sum_{k=1}^{n} (-1)^k (k-1)!\,S(n,k)$.


4.125. Consider an n × n lower-triangular matrix A such that A(n,k) is the number of Dyck
paths ending with exactly k east steps, for $1 \leq k \leq n$. Find a combinatorial description of
$A^{-1}$, and prove that this is the inverse of A using an involution.


4.126. Garsia-Milne Involution Principle. Suppose I and J are involutions defined on
finite signed sets X and Y, respectively. Suppose $f : X \to Y$ is a sign-preserving bijection,
i.e., sgn(f(x)) = sgn(x) for all $x \in X$. Suppose also that every object in Fix(I) and Fix(J)
has positive sign. Construct an explicit bijection $g : \mathrm{Fix}(I) \to \mathrm{Fix}(J)$.


4.127. Bijective Subtraction. Suppose A, B, and C are finite, pairwise disjoint sets and
$f : A \cup B \to A \cup C$ is a given bijection. Construct an explicit bijection $g : B \to C$.


4.128. Bijective Division by Two. Suppose A and B are finite sets. Given a bijection
$f : \{0,1\} \times A \to \{0,1\} \times B$, can you use f to construct an explicit bijection $g : A \to B$?


4.129. In §4.1 we proved combinatorially that $\sum_k s(i,k)S(k,j) = \chi(i = j)$. Can you find
a combinatorial proof that $\sum_k S(i,k)s(k,j) = \chi(i = j)$? (Compare with 2.77(d).)


4.130. Find a bijective proof of the derangement recursion $d_n = n d_{n-1} + (-1)^n$.


4.131. Let $X_n$ be the set of set partitions of {1,2,...,n}. Define the refinement ordering
on $X_n$ by setting, for $P,Q \in X_n$, $P \preceq Q$ iff every block $S \in P$ is contained in some block
$T \in Q$. (a) Show that $(X_n,\preceq)$ is a poset. (b) Compute the Möbius function of this poset
for $1 \leq n \leq 4$. (c) Show that any interval [P,Q] in $X_n$ (see 4.99) is isomorphic to a poset
$(X_k,\preceq)$ for some k. (d) Compute $\mu_{X_n}$ for all n.


Notes


A thorough treatment of posets from the combinatorial viewpoint appears in Chapter 3 of 
Stanley [127]. See Rota [118] for one of the seminal papers on Möbius inversion in combinatorics.
A classic text on posets is the book by Birkhoff [12]. The Garsia-Milne involution
principle in 4.126 was introduced in [49, 50]. For applications and extensions of this principle, 
the reader may consult the following sources: [57, 73, 87, 88, 108, 109, 140]. An application
of bijective subtraction (see 4.127) is presented in Loehr [85].


Ranking and Unranking


This chapter studies the notions of ranking and unranking from a bijective viewpoint. 
Intuitively, our goal is to find algorithms that implement bijective maps between an n-element
set of combinatorial objects and the set of integers n = {0, 1, 2, ..., n − 1}. These
algorithms will allow us to solve a variety of combinatorial problems. We begin the chapter 
by introducing some of these problems. Then we discuss bijective versions of the sum and 
product rules. These new rules provide a mechanical method for translating the counting 
arguments in earlier chapters into explicit bijections. Recursions derived using the sum and 
product rules can be treated in the same way; the resulting bijections are typically specified 
by recursive algorithms. 




5.1 Ranking, Unranking, and Related Problems 


Suppose S is a finite set of objects. We will study five fundamental combinatorial problems 
involving the set S: counting, listing, ranking, unranking, and random selection. 


1. Counting. The counting problem asks us to compute the number of elements in the finite 
set S. We have already discussed tools to solve the counting problem in Chapter 1. 


2. Listing. The listing problem asks us to list all the elements in the set S exactly once. 
There are many possible lists for a given set S; usually we produce lists that present the
objects of S in a special order. Examples of such orderings are lexicographic orderings 
and Gray code orderings. 


3. Ranking. Suppose we have specified a particular ordering for listing the elements of S. 
Given an object x € S, the ranking problem asks us to calculate the position (or rank) 
of x on the list without actually listing all the objects preceding (or following) x on the 
list. It is often convenient to number the positions on the list starting with position 0. 


4. Unranking. Suppose we have specified a particular ordering for listing the elements of 
S. Given an integer m with 0 ≤ m < |S|, the unranking problem asks us to calculate
the object x in S that occupies position m on the list, without actually generating the 
whole list. 


5. Random Selection. The random selection problem asks us to devise a way to choose a 
random element of S, where “random” means that each element of S is equally likely to
be chosen. We assume that we are given a device that will produce random real numbers
in the interval [0,1]. Alternatively, we can assume that we have a device that, given a
positive integer n, randomly picks an integer in the set n = {0, 1, ..., n − 1}.


If we can solve the listing problem for S, then we can (in principle) solve the counting
problem, the ranking problem, and the unranking problem by simply writing down the whole
list. However, we usually desire more efficient methods for counting S, ranking objects, and
unranking integers. Also, this method is impractical if we are studying not just one finite
set S but a whole infinite family of such sets.

Similarly, if we can solve the counting problem and unranking problem for S, then we can 
solve the random selection problem as follows. Use the random number generator described 
above, with n = |S|, to get a random integer between 0 and |S| − 1. Unrank this integer to
get the random object in S. Note that if |S| is very large, it may be difficult to implement a
random number generator that uniformly chooses integers in the desired range. Thus, even 
if we can count S and unrank elements of S, it is valuable to find other ways to solve the 
random selection problem that avoid the random selection of integers in a huge range. 
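
The reduction of random selection to counting plus unranking can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `random_element` and `unrank_word` are hypothetical, and the toy set S (all three-letter words over {a, b, c}) is chosen here just to have something concrete to unrank.

```python
import random

def random_element(count, unrank, rng=random):
    """Pick a uniformly random object of S, given count = |S| and an
    unranking map unrank: {0, ..., count-1} -> S."""
    m = rng.randrange(count)      # uniform integer in {0, 1, ..., count-1}
    return unrank(m)

def unrank_word(m):
    """Toy unranking map (an assumption for this sketch): write m in base 3
    and read the three digits as letters of a word over {a, b, c}."""
    letters = "abc"
    return letters[m // 9] + letters[(m // 3) % 3] + letters[m % 3]

sample = random_element(27, unrank_word)   # some word in {a, b, c}^3
```

Because `unrank_word` is a bijection from {0, ..., 26} onto S and `randrange` is uniform, every word is equally likely. The caveat in the text still applies: for huge |S| the quality of the integer generator becomes the bottleneck.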

In bijective combinatorics, we try to solve the five enumeration problems posed above by
constructing explicit bijections. For example, the ranking problem amounts to constructing
a bijection $r : S \to n$ with specified properties. The unranking problem amounts to
constructing the inverse bijection $u : n \to S$. The list of elements of S determined by the
unranking map u is the sequence (u(0), u(1), ..., u(n − 1)).

Here and below, “constructing” a map $h : A \to B$ means giving an algorithm that takes
as input an element $a \in A$ and produces as output the corresponding element $h(a) \in B$.
We must also prove that the map computed by the proposed algorithm is indeed a bijection,
and we would like to have algorithms that are as efficient as possible. Note that knowing an
algorithm to compute a bijection $h : A \to B$ is not the same as knowing an algorithm to
compute the inverse bijection $h^{-1} : B \to A$. We say that we have solved the counting problem
for S bijectively if we have algorithms to compute both a bijection $u : n \to S$ and its inverse
$r : S \to n$. Observe that the bijective counting problem is harder than the original enumerative
counting problem: if S is complicated, we may be able to determine that |S| = n without
constructing any explicit bijections between S and n. By definition, saying that |S| = n means
that such bijections exist. But knowing this abstract existence statement is much weaker than
actually giving constructions and algorithms that implement particular bijections.

As noted above, each bijection $u : n \to S$ provides one solution to the listing problem
for S. If we are asked to list elements of S in a particular order, then we must construct an
appropriate bijection u such that the list (u(0), u(1), ..., u(n − 1)) contains the objects in
S in the desired order. It is sometimes desirable to have an auxiliary successor algorithm
that, given an object u(i) in the list, outputs the next object u(i + 1) without ever explicitly
computing i. We can then list the objects in S by starting with u(0) and repeatedly invoking
the successor algorithm. We will consider the construction of successor algorithms at the
end of this chapter.




5.2 Bijective Sum Rule 


We begin our study of ranking and unranking by revisiting the fundamental counting rules 
from Chapter 1. Our first rule lets us assemble ranking (resp. unranking) maps for two 
disjoint finite sets to obtain a ranking (resp. unranking) map for the union of these sets. 
Throughout this chapter, the notation n will be used to denote the set {0, 1, ..., n − 1}.


5.1. Bijective Sum Rule for Two Sets. Let S and T be disjoint finite sets.
(a) Given bijections $f : S \to n$ and $g : T \to m$, there is a canonical bijection
$f + g : S \cup T \to n + m$ defined by

$$(f+g)(x) = \begin{cases} f(x) & \text{for } x \in S; \\ g(x) + n & \text{for } x \in T. \end{cases}$$

(b) Given bijections $f' : n \to S$ and $g' : m \to T$, there is a canonical bijection
$f' + g' : n + m \to S \cup T$ defined by

$$(f'+g')(k) = \begin{cases} f'(k) & \text{for } 0 \leq k < n; \\ g'(k-n) & \text{for } n \leq k < n+m. \end{cases}$$

(c) If $f' = f^{-1}$ and $g' = g^{-1}$, then $f' + g' = (f+g)^{-1}$.


We leave the detailed verification of this rule as an exercise. The disjointness of S and
T is critical when showing that f + g is a well-defined function and that f' + g' is injective.
Observe that the order in which we combine the bijections makes a difference: the bijection
$f + g : S \cup T \to n + m$ is not the same as the bijection $g + f : S \cup T \to n + m$. Intuitively,
the ranking bijection f + g assigns earlier ranks to elements of S (using f to determine these
ranks) and assigns later ranks to elements of T (using g); g + f does the opposite. Similarly,
the unranking map f' + g' generates a list in which elements of S occur first, followed by
elements of T; g' + f' lists elements of T first, then S.

Iterating the bijective sum rule for two sets leads to the following general version of this 
rule. 


5.2. Bijective Sum Rule for k Sets. Suppose $(S_1,\ldots,S_k)$ is an ordered list of pairwise
disjoint finite sets. Let $S = \bigcup_{i=1}^{k} S_i$ be the union of these sets. Let $n_i = |S_i|$ and
$n = |S| = n_1 + n_2 + \cdots + n_k$.

(a) Given bijections $f_i : S_i \to n_i$ for $1 \leq i \leq k$, there is a canonical bijection
$f = \sum_{i=1}^{k} f_i : S \to n$ defined by $f(x) = f_i(x) + \sum_{j<i} n_j$ for $x \in S_i$.

(b) Given bijections $f_i' : n_i \to S_i$ for $1 \leq i \leq k$, there is a canonical bijection
$f' = \sum_{i=1}^{k} f_i' : n \to S$ specified as follows. For each $x \in n$, there exists a unique i
($1 \leq i \leq k$) such that $n_1 + \cdots + n_{i-1} \leq x < n_1 + \cdots + n_i$. Define
$f'(x) = f_i'(x - [n_1 + \cdots + n_{i-1}]) \in S_i \subseteq S$.

(c) If $f_i' = f_i^{-1}$ for each i, then $f' = f^{-1}$.




As before, we leave the formal proof of the bijective sum rule to the reader. (One can
give a direct proof that the maps in question are bijections, or use an induction argument
involving the bijective sum rule for two sets to show that $\sum_{i=1}^{k} f_i = \sum_{i=1}^{k-1} f_i + f_k$.)
Intuitively, $f_1 + \cdots + f_k$ is the ranking map that assigns elements of $S_1$ to positions 0 through
$n_1 - 1$ using $f_1$, assigns elements of $S_2$ to positions $n_1$ through $n_1 + n_2 - 1$ using $f_2$, etc.
The unranking map $f_1' + \cdots + f_k'$ generates a listing of S in which objects in $S_1$ occur first,
then objects in $S_2$, and so on.
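
As a rough illustration of rule 5.2, the following Python sketch glues together ranking and unranking maps for several disjoint blocks. The representation of a block as a tuple (membership test, size, rank map, unrank map) and the names `sum_rule` and `block` are assumptions made for this sketch, not notation from the text.

```python
from bisect import bisect_right
from itertools import accumulate

def sum_rule(blocks):
    """blocks: ordered list of (contains, size, rank_i, unrank_i) for pairwise
    disjoint sets S_1, ..., S_k.  Returns (rank, unrank) for their union."""
    # offsets[i] = n_1 + ... + n_i, with offsets[0] = 0
    offsets = [0] + list(accumulate(size for _, size, _, _ in blocks))

    def rank(x):
        for i, (contains, _, rank_i, _) in enumerate(blocks):
            if contains(x):
                return offsets[i] + rank_i(x)   # f_i(x) + sum of earlier sizes
        raise ValueError("x does not belong to any block")

    def unrank(m):
        if not 0 <= m < offsets[-1]:
            raise ValueError("m out of range")
        i = bisect_right(offsets, m) - 1        # offsets[i] <= m < offsets[i+1]
        return blocks[i][3](m - offsets[i])

    return rank, unrank

def block(s):
    """Hypothetical helper: a block whose elements are the characters of s,
    ranked by their position in s."""
    return (lambda x: x in s, len(s), s.index, lambda m: s[m])

rank, unrank = sum_rule([block("aei"), block("0123"), block("xy")])
listing = [unrank(m) for m in range(9)]   # elements of S_1, then S_2, then S_3
```

Reordering the list of blocks changes both maps, which mirrors the remark that f + g and g + f are different bijections.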


5.3 Bijective Product Rule 


Now we consider bijective versions of the product rule, which generalize the familiar “base-b 
expansions” of natural numbers. Before introducing the bijective product rule, we recall the 
following theorem concerning integer division with remainder. 


5.3. Theorem: Integer Division. Suppose a is any integer and b is a positive integer.
There exist unique integers q and r such that

a = bq + r and 0 ≤ r < b.

Furthermore, there is an algorithm to compute q and r given a and b. The integers q and r
are called the quotient and remainder when a is divided by b.

Proof. First assume $a \geq 0$. Consider the following algorithm. Define $a_0 = a$ and i = 0,
and then loop as follows. If $a_i \geq b$, define $a_{i+1} = a_i - b \geq 0$, then replace i by i + 1,
and continue to loop. Otherwise, terminate with the answer q = i and $r = a_i$. Now, this
algorithm must terminate, since otherwise we would have an infinite strictly decreasing
sequence $a_0 > a_1 > a_2 > \cdots$ of elements of N, which is impossible. An induction on i shows
that $a = bi + a_i$ for each i such that $a_i$ is defined. Therefore, when the algorithm terminates,
we will indeed have a = bq + r and 0 ≤ r < b.

If a < 0, use the preceding algorithm to compute $q_1, r_1 \in \mathbb{Z}$ with $|a| = bq_1 + r_1$ and
$0 \leq r_1 < b$. Then $a = -bq_1 - r_1$. If $r_1 = 0$, set $q = -q_1$ and r = 0; otherwise, set $q = -1 - q_1$
and $r = b - r_1$. One readily checks that a = bq + r and 0 ≤ r < b. This completes the
algorithmic proof of the existence of q and r.

For uniqueness of q and r, suppose we have a = bq + r = bq' + r' where $q, r, q', r' \in \mathbb{Z}$
and 0 ≤ r, r' < b. Rearranging the given equations, we see that b(q − q') = r' − r. The right
side is an integer strictly between −b and b, whereas the left side is an integer multiple of
b. The only such multiple of b is zero, so q = q' and r = r'. □
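
The existence argument is itself an algorithm, and a direct transcription may help in checking the case analysis for negative a. The function name `divide` is an assumption of this sketch; the body follows the repeated-subtraction proof rather than any efficient method.

```python
def divide(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < b, by repeated subtraction."""
    if b <= 0:
        raise ValueError("b must be a positive integer")
    if a >= 0:
        q, r = 0, a
        while r >= b:          # invariant: a == b*q + r and r >= 0
            r -= b
            q += 1
        return q, r
    q1, r1 = divide(-a, b)     # |a| = b*q1 + r1 with 0 <= r1 < b
    if r1 == 0:
        return -q1, 0
    return -1 - q1, b - r1     # b*(-1 - q1) + (b - r1) == -(b*q1 + r1) == a

# divide(17, 5) == (3, 2) and divide(-17, 5) == (-4, 3)
```

For positive b this agrees with Python's built-in `divmod`, which is the practical choice; the loop above takes about a/b steps.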


5.4. Remark. The division algorithm used in the preceding proof is quite inefficient. In 
practice, one divides a by b using the “long division” algorithm (see 5.90). 


5.5. Theorem: Base-b Expansions. Let b > 1 be a fixed positive integer. For every integer
$a \geq 0$, there exists a unique sequence $(d_0, d_1, d_2, \ldots)$ of integers satisfying the following
properties: $0 \leq d_i < b$ for all i; all but finitely many $d_i$'s are zero; and

$$a = d_0 + d_1 b + d_2 b^2 + \cdots + d_i b^i + \cdots = \sum_{i \geq 0} d_i b^i.$$

We call $(d_0, d_1, \ldots)$ the base-b expansion of a, which can be written more concisely as
$[a]_b = \cdots d_3 d_2 d_1 d_0$.

We only sketch the proof. The idea is to divide a by b repeatedly. The first remainder r is
the last “digit” $d_0$; expanding the first quotient q in the same manner yields the remaining
digits $\cdots d_3 d_2 d_1$. Uniqueness follows by induction, using the uniqueness assertion in 5.3. We
will give another proof of this result later, by using the bijective product rule to rank and
unrank words in the product set $b^k$.
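
The proof sketch (divide repeatedly, collect remainders) translates into a short loop. The function name `base_b_digits` and the convention of returning only the digits up to the last nonzero one are choices made for this sketch.

```python
def base_b_digits(a, b):
    """Return [d_0, d_1, ...] (least significant digit first) such that
    a == sum(d_i * b**i) and 0 <= d_i < b.  Trailing zero digits are omitted,
    so a == 0 gives the empty list."""
    if a < 0 or b <= 1:
        raise ValueError("need a >= 0 and b > 1")
    digits = []
    while a > 0:
        a, d = divmod(a, b)    # the remainder is the next digit, the quotient is what remains
        digits.append(d)
    return digits

# base_b_digits(200000, 26) == [8, 22, 9, 11], i.e. 200000 = 11*26**3 + 9*26**2 + 22*26 + 8
```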

We can generalize the preceding discussion by allowing the base b to change at each 
step. This idea leads to the general version of the bijective product rule, presented below. 
First, we consider the simpler version of this rule involving two sets. 


5.6. Theorem: Bijective Product Rule for Two Sets. Suppose s and t are positive
integers, and n = st. The map $p = p_{s,t} : s \times t \to n$ given by p(i,j) = it + j is a bijection.
To compute the inverse map p'(u) (where $u \in n$), use division to write u = qt + r where
0 ≤ r < t, and set p'(u) = (q, r).


Proof. First we check that p does map s × t into n. Suppose 0 ≤ i < s and 0 ≤ j < t are
integers. Then 0 ≤ it + j. Furthermore, since i ≤ s − 1, we have it + j ≤ (s − 1)t + j <
(s − 1)t + t = st. Thus, $p(i,j) \in st = n$. Next we check that p' does map n into s × t.
Given $u \in n$, the integers q and r in the definition of p' are well defined, by the existence
and uniqueness assertions in 5.3. The condition on the remainder in 5.3 assures us that
$r \in t$. Since q < 0 implies u < 0, while q ≥ s implies u ≥ st, and we are assuming that
0 ≤ u < n = st, we conclude that 0 ≤ q < s. Thus, $q \in s$, so p' does map into the set
s × t. It is now routine to check that the maps p and p' are indeed two-sided inverses of
each other, so both maps are bijections. □

5.7. Remark. Note that the bijection $p_{s,t}$ just constructed depends on the order of s and
t. More explicitly, $p_{s,t} : s \times t \to n$ sends (i,j) to it + j, whereas $p_{t,s} : t \times s \to n$ sends (j,i)
to js + i. Let $F : s \times t \to t \times s$ be the bijection given by F((i,j)) = (j,i). The preceding
formulas show that $p_{t,s} \circ F \neq p_{s,t}$ for $s \neq t$, although both functions are bijections from
s × t to st.

5.8. Remark. The bijections in the product rule can also be built automatically using the
bijective sum rule. In the sum rule, take $S_i = \{i\} \times t$ for 0 ≤ i < s, and let $f_i : S_i \to t$ be
the bijection defined by $f_i(i,j) = j$ for all $j \in t$. The set s × t is the disjoint union of the
$S_i$'s, so the bijective sum rule furnishes a bijection $f = \sum_{i<s} f_i : s \times t \to st$. The formula
for f gives $f(i,j) = f_i(i,j) + \sum_{u<i} |S_u| = j + it = p_{s,t}(i,j)$. Similarly, $p_{t,s} \circ F$ can be
obtained by adding up the bijections $s \times \{j\} \to s$ sending (i,j) to i. Since the bijective sum
rule guarantees the invertibility of f, we see that the division algorithm 5.3 can be deduced
as a consequence of 5.2.
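
A minimal Python sketch of Theorem 5.6 and of the order dependence noted in Remark 5.7 follows. The function names `p` and `p_inv` and the explicit passing of s and t as arguments are conventions of this sketch.

```python
def p(i, j, s, t):
    """p_{s,t}(i, j) = i*t + j for 0 <= i < s and 0 <= j < t."""
    if not (0 <= i < s and 0 <= j < t):
        raise ValueError("(i, j) is not in s x t")
    return i * t + j

def p_inv(u, s, t):
    """Inverse map: write u = q*t + r with 0 <= r < t and return (q, r)."""
    if not 0 <= u < s * t:
        raise ValueError("u is not in st")
    return divmod(u, t)

# Order matters: with s = 3 and t = 4, the pair (i, j) = (1, 2) has
# p_{s,t}(i, j) = 1*4 + 2 = 6, while p_{t,s}(j, i) = 2*3 + 1 = 7.
```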

5.9. Theorem. Suppose $f : A \to C$ and $g : B \to D$ are bijections. Then there is a canonical
bijection $f \times g : A \times B \to C \times D$ given by

$$(f \times g)(a,b) = (f(a), g(b)) \qquad (a \in A,\ b \in B).$$

Proof. First, f × g is a function mapping A × B into C × D. Also $f^{-1} \times g^{-1}$ (which sends
(c,d) to $(f^{-1}(c), g^{-1}(d))$) is a function from C × D to A × B. One checks immediately that
$f^{-1} \times g^{-1}$ is the two-sided inverse of f × g, so both maps are bijections. □

This result extends immediately to Cartesian products of finitely many sets.

5.10. Theorem. Suppose $f_i : S_i \to T_i$ are bijections, for $1 \leq i \leq k$. Then the map

$$f = f_1 \times f_2 \times \cdots \times f_k : S_1 \times \cdots \times S_k \to T_1 \times \cdots \times T_k,$$

given by $f(s_1, s_2, \ldots, s_k) = (f_1(s_1), f_2(s_2), \ldots, f_k(s_k))$, is a bijection with inverse
$f_1^{-1} \times f_2^{-1} \times \cdots \times f_k^{-1}$.


5.11. Theorem: Bijective Product Rule for k Sets. Suppose $n_1,\ldots,n_k$ are given
positive integers, and $n = n_1 n_2 \cdots n_k$. There are canonical bijections

$$p = p_{n_1,\ldots,n_k} : n_1 \times n_2 \times \cdots \times n_k \to n, \qquad p' = p'_{n_1,\ldots,n_k} : n \to n_1 \times n_2 \times \cdots \times n_k.$$

The map p is defined by

$$p(c_1, c_2, \ldots, c_k) = \sum_{i=1}^{k} c_i \prod_{j=i+1}^{k} n_j.$$

The inverse map p' is computed via the following algorithm. Given an input m with
0 ≤ m < n, divide m by $n_k$ to get a quotient $q_k$ and remainder $r_k$. Next, divide $q_k$ by $n_{k-1}$ to
get a quotient $q_{k-1}$ and remainder $r_{k-1}$. Continue in this way, dividing $q_i$ by $n_{i-1}$ to get a
quotient $q_{i-1}$ and remainder $r_{i-1}$, for $i = k, k-1, \ldots, 2$. After completing all these divisions, set
$p'(m) = (r_1, r_2, \ldots, r_k)$.

Proof. We use induction on k to show that p is a bijection. When k = 1, p is an identity map.
When k = 2, the result follows from the bijective product rule for two sets. Finally, assume
k > 2 and the result is already known for products of k − 1 sets. Writing $n' = n_2 n_3 \cdots n_k$,
observe that

$$\begin{aligned}
p_{n_1,\ldots,n_k}(c_1,\ldots,c_k) &= c_1(n_2 \cdots n_k) + \sum_{i=2}^{k} c_i \prod_{j=i+1}^{k} n_j \\
&= c_1 n' + p_{n_2,\ldots,n_k}(c_2,\ldots,c_k) \\
&= p_{n_1,n'}(c_1,\ p_{n_2,\ldots,n_k}(c_2,\ldots,c_k)).
\end{aligned}$$

This means that $p_{n_1,\ldots,n_k}$ is the composition of the two bijections

$$\mathrm{id}_{n_1} \times p_{n_2,\ldots,n_k} : n_1 \times (n_2 \times \cdots \times n_k) \to n_1 \times n' \quad\text{and}\quad p_{n_1,n'} : n_1 \times n' \to n.$$

To show that p' is the inverse of p, it is sufficient to verify that $p'(p(c_1,\ldots,c_k)) = (c_1,\ldots,c_k)$,
since we already know that $|n_1 \times \cdots \times n_k| = |n|$. Again we use induction. Note that

$$p_{n_1,\ldots,n_k}(c_1,\ldots,c_k) = \sum_{i=1}^{k} c_i \prod_{i<j\leq k} n_j = c_k + n_k \sum_{i=1}^{k-1} c_i \prod_{i<j\leq k-1} n_j.$$

This shows that the quotient and remainder when we divide $p(c_1,\ldots,c_k)$ by $n_k$ are

$$q_k = \sum_{i=1}^{k-1} c_i \prod_{i<j\leq k-1} n_j = p_{n_1,\ldots,n_{k-1}}(c_1,\ldots,c_{k-1})$$

and $r_k = c_k$, respectively. So the first division step successfully recovers $c_k$. The remaining
divisions compute

$$p'_{n_1,\ldots,n_{k-1}}(q_k) = p'_{n_1,\ldots,n_{k-1}}(p_{n_1,\ldots,n_{k-1}}(c_1,\ldots,c_{k-1})),$$

which equals $(c_1,\ldots,c_{k-1})$ by induction hypothesis. □


5.12. Example. Suppose $(n_1,n_2,n_3,n_4,n_5) = (4,6,5,4,2)$, so $n = n_1 n_2 n_3 n_4 n_5 = 960$.
Then

p(3,1,0,2,1) = 3·(6·5·4·2) + 1·(5·4·2) + 0·(4·2) + 2·(2) + 1 = 765.

To compute $p^{-1}(222)$, first divide 222 by $n_5 = 2$ to get $q_5 = 111$ and $r_5 = 0$. Then divide
111 by $n_4 = 4$ to get $q_4 = 27$ and $r_4 = 3$. Then divide 27 by $n_3 = 5$ to get $q_3 = 5$ and
$r_3 = 2$. Then divide 5 by $n_2 = 6$ to get $q_2 = 0$ and $r_2 = 5$. Finally, divide 0 by $n_1 = 4$ to
get $q_1 = 0$ and $r_1 = 0$. We conclude that $p^{-1}(222) = (0,5,2,3,0)$.
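
The maps of Theorem 5.11 can be sketched as follows and checked against the numbers in this example. The names `p` and `p_inv`, and the choice to pass the radix list $(n_1,\ldots,n_k)$ as a tuple `n`, are assumptions of the sketch. The ranking loop evaluates the defining sum in Horner form, which avoids computing the partial products separately.

```python
def p(c, n):
    """Rank c = (c_1, ..., c_k), where 0 <= c_i < n_i:
    returns the sum over i of c_i * n_{i+1} * ... * n_k."""
    rank = 0
    for ci, ni in zip(c, n):
        if not 0 <= ci < ni:
            raise ValueError("choice out of range")
        rank = rank * ni + ci          # Horner-style evaluation of the mixed-radix sum
    return rank

def p_inv(m, n):
    """Unrank m by dividing by n_k, then n_{k-1}, ..., then n_1, keeping remainders."""
    remainders = []
    for ni in reversed(n):
        m, r = divmod(m, ni)
        remainders.append(r)           # r_k first, r_1 last
    return tuple(reversed(remainders))

radices = (4, 6, 5, 4, 2)
# p((3, 1, 0, 2, 1), radices) == 765 and p_inv(222, radices) == (0, 5, 2, 3, 0)
```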


(for $0 \leq z < b^k$). Here, p and $p^{-1}$ are bijections between the set of words $\{0,1,\ldots,b-1\}^k$ and
the set of integers $\{0,1,2,\ldots,b^k-1\}$. Taking b = 10, we see that the decimal representation
of positive integers is a special case of the bijective product rule.

5.14. Remark. Here is another algorithm for computing $p'_{n_1,\ldots,n_k}(m) = p^{-1}(p(c_1,\ldots,c_k))$
that recovers the numbers $(c_1,\ldots,c_k)$ from left to right. First, divide m by $n_2 n_3 \cdots n_k$ to
obtain a quotient $q_1$ and a remainder $r_1$. Set $c_1 = q_1$, and recover $(c_2,\ldots,c_k)$ by recursively

does implement the inverse of $p_{n_1,\ldots,n_k}$.


We can now describe the general strategy for converting informal counting arguments
based on the product rule to ranking and unranking algorithms. When using the product
rule, we uniquely construct objects in a set S by making k choices, such that there are always
$n_i$ ways to make the ith choice. There is often a natural ordering of the choices that can be
made at the ith stage, given the choices that have already been made; thus we can number
the available choices at this stage $0, 1, \ldots, n_i - 1$. Then the informal construction process for
manufacturing objects in S can be translated into a formal bijection $f : n_1 \times \cdots \times n_k \to S$,
where $f(c_1,\ldots,c_k)$ is the object constructed by making the choice numbered $c_i$ at the ith
stage, for $1 \leq i \leq k$. To solve the unranking problem for S, we use the composite bijection

$$f \circ p^{-1}_{n_1,\ldots,n_k} : n \to S.$$

Similarly, if we can find an algorithm that recovers the choices $c_i$ from the final object $x \in S$,
then we can compute $f^{-1}$. The ranking problem for S is then solved by the bijection

$$p_{n_1,\ldots,n_k} \circ f^{-1} : S \to n.$$

In the next several sections, we apply this idea to find ranking and unranking bijections for 
many commonly occurring families of combinatorial objects. 




5.4 Ranking Words 


5.15. Example: Four-Letter Words. Let S be the set of all four-letter words. We can
build elements of S by choosing the first letter, then the second, third, and fourth letters.
To describe this choice process formally as a bijection, let A = {a, b, ..., z} be the alphabet.
We identify a word $w = w_1 w_2 w_3 w_4 \in S$ with the 4-tuple $(w_1,w_2,w_3,w_4) \in A \times A \times A \times A$.
Choose a fixed bijection $f : A \to 26$; for instance, we can use the standard alphabetical
ordering given by

f(a) = 0, f(b) = 1, f(c) = 2, ..., f(y) = 24, f(z) = 25.

Then f × f × f × f is a bijection from S = A × A × A × A to $26^4$. Composing with the bijection
$p : 26 \times 26 \times 26 \times 26 \to 26^4 = 456{,}976$, we obtain a ranking map $r : S \to 456{,}976$. For
example,

$r(\text{goop}) = p_{26,26,26,26}(6,14,14,15) = 6 \cdot 26^3 + 14 \cdot 26^2 + 14 \cdot 26^1 + 15 = 115{,}299$;

$r(\text{pogo}) = p_{26,26,26,26}(15,14,6,14) = 15 \cdot 26^3 + 14 \cdot 26^2 + 6 \cdot 26^1 + 14 = 273{,}274$.

The inverse of r is the unranking map $u : 456{,}976 \to S$. To compute u(x), we first express
x in base 26 (this is what $p^{-1}_{26,26,26,26}$ does), and then use the inverse of f to convert back
to letters. For example,

$u(200{,}000) = u(11 \cdot 26^3 + 9 \cdot 26^2 + 22 \cdot 26 + 8) = f^{-1}(11)f^{-1}(9)f^{-1}(22)f^{-1}(8) = \text{ljwi}$.

In general, if A is an m-letter alphabet, we can rank the set of k-letter words $A^k$ by
fixing a bijection $f : A \to m$ and computing

$r(w_1 w_2 \cdots w_k) = p_{m,m,\ldots,m}(f(w_1), f(w_2), \ldots, f(w_k))$.

To unrank an integer $z \in m^k$, write $z = d_{k-1} \cdots d_0$ in base m and then replace each digit
$d_i$ by $f^{-1}(d_i)$.
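
The word ranking just described amounts to base-m arithmetic on letter positions. Here is a small Python sketch; the function names `rank_word` and `unrank_word` are hypothetical, and the alphabet is passed as a string whose character order plays the role of the bijection f.

```python
import string

def rank_word(w, alphabet=string.ascii_lowercase):
    """Rank the word w among all words of length len(w) over the alphabet,
    listed in alphabetical order (ranks start at 0)."""
    m = len(alphabet)
    rank = 0
    for ch in w:
        rank = rank * m + alphabet.index(ch)   # f(ch) is the position of ch
    return rank

def unrank_word(z, k, alphabet=string.ascii_lowercase):
    """Return the k-letter word of rank z: write z in base m, then decode digits."""
    m = len(alphabet)
    letters = []
    for _ in range(k):
        z, d = divmod(z, m)
        letters.append(alphabet[d])            # least significant digit first
    return "".join(reversed(letters))

# rank_word("goop") == 115299, rank_word("pogo") == 273274,
# unrank_word(200000, 4) == "ljwi"
```

Passing `alphabet="abc"` gives the three-letter setting of the next example; for instance, `unrank_word(5, 3, "abc")` returns `"abc"`.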


5.16. Example: Three-Letter Words. Now consider $S = \{a,b,c\}^3$. Define $f : \{a,b,c\} \to 3$
by f(a) = 0, f(b) = 1, and f(c) = 2. The ranking map $r : S \to 27$ is defined by

$r(w_1 w_2 w_3) = f(w_1) \cdot 9 + f(w_2) \cdot 3 + f(w_3)$.

To define the unranking map $u : 27 \to S$, write $z \in 27$ as $z = d_2 d_1 d_0$ in base 3. Then

$u(z) = u(d_2 d_1 d_0) = f^{-1}(d_2) f^{-1}(d_1) f^{-1}(d_0)$.


These bijections set up the following one-to-one correspondences between S, 3 × 3 × 3, and
27:

aaa ↔ 000 = 0     baa ↔ 100 = 9      caa ↔ 200 = 18
aab ↔ 001 = 1     bab ↔ 101 = 10     cab ↔ 201 = 19
aac ↔ 002 = 2     bac ↔ 102 = 11     cac ↔ 202 = 20
aba ↔ 010 = 3     bba ↔ 110 = 12     cba ↔ 210 = 21
abb ↔ 011 = 4     bbb ↔ 111 = 13     cbb ↔ 211 = 22
abc ↔ 012 = 5     bbc ↔ 112 = 14     cbc ↔ 212 = 23
aca ↔ 020 = 6     bca ↔ 120 = 15     cca ↔ 220 = 24
acb ↔ 021 = 7     bcb ↔ 121 = 16     ccb ↔ 221 = 25
acc ↔ 022 = 8     bcc ↔ 122 = 17     ccc ↔ 222 = 26


Notice that, as z runs from 0 to 26, the words in S are generated in alphabetical order.


More generally, using the bijective product rule will generate the elements of a set S
in a certain lexicographic order that is determined by the nature of the “choice bijection”

in $n_1$ ways) is deemed “most significant,” and the last choice (which can occur in $n_k$ ways)
is “least significant.” If we unrank 0, 1, ..., n − 1 in this order, we obtain a list that begins
with the $n_2 \cdots n_k$ objects that can be made by choosing zero in the first choice. Next we get
all the objects that can be made by choosing one in the first choice, etc. Each such sublist
is also arranged lexicographically, according to the choices made at stages 2, 3, ..., k.


5.17. Example: Words with Restrictions. Let S be the set of four-letter words
$w_1 w_2 w_3 w_4$ that begin and end with consonants and have a vowel in the second position.
Choosing letters from left to right and using the product rule, we see that |S| = 21·5·26·21 =
57,330. Let C, V, and A denote the set of consonants, vowels, and all letters, respectively.
The usual alphabetical order defines bijections $C \to 21$, $V \to 5$, and $A \to 26$; for example,

$V \to 5$ via a ↦ 0, e ↦ 1, i ↦ 2, o ↦ 3, u ↦ 4.

We obtain a ranking map $r : S \to 57{,}330$ by defining

$r(w_1 w_2 w_3 w_4) = p_{21,5,26,21}(\overline{w_1}, \overline{w_2}, \overline{w_3}, \overline{w_4})$,

where $\overline{w_i}$ denotes the image of $w_i$ under the appropriate bijection. For example,

$r(\text{host}) = p_{21,5,26,21}(5,3,18,15) = 5 \cdot (5 \cdot 26 \cdot 21) + 3 \cdot (26 \cdot 21) + 18 \cdot (21) + 15 = 15{,}681$.


We unrank by applying $p^{-1}_{21,5,26,21}$ and then decoding to letters. For example, repeated
division shows that

$p^{-1}_{21,5,26,21}(44001) = (16, 0, 15, 6)$,

and therefore u(44001) = vapj. This unranking method generates the words in S in
alphabetical order.
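
When the set of allowed letters varies by position, the same idea works with one alphabet per position. The sketch below is illustrative only: the constant `POSITIONS` and the function names are assumptions, and each per-position string encodes the alphabetical bijections $C \to 21$, $V \to 5$, $A \to 26$ by character order.

```python
import string

LETTERS = string.ascii_lowercase
VOWELS = "aeiou"
CONSONANTS = "".join(ch for ch in LETTERS if ch not in VOWELS)   # 21 letters

# Allowed letters at positions 1..4: consonant, vowel, any letter, consonant.
POSITIONS = [CONSONANTS, VOWELS, LETTERS, CONSONANTS]

def rank_restricted(word, positions=POSITIONS):
    """Mixed-radix rank: the radix at each position is the size of its alphabet."""
    rank = 0
    for ch, alpha in zip(word, positions):
        rank = rank * len(alpha) + alpha.index(ch)
    return rank

def unrank_restricted(m, positions=POSITIONS):
    """Divide by the last radix first, then decode each remainder to a letter."""
    out = []
    for alpha in reversed(positions):
        m, d = divmod(m, len(alpha))
        out.append(alpha[d])
    return "".join(reversed(out))

# rank_restricted("host") == 15681 and unrank_restricted(44001) == "vapj"
```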


5.18. Example: License Plates. A California license plate consists of a digit, followed by
three letters, followed by three digits. We can use the preceding ideas to rank and unrank
license plates. For instance,

$r(\text{3PZY292}) = p_{10,26,26,26,1000}(3, 15, 25, 24, 292) = 63{,}542{,}292$.


5.5 Ranking Permutations


In the examples considered so far, the choices made at each stage of the product rule did 
not depend on what choices were made in previous stages. This section studies the more 
complicated situation where the available choices do depend on what happened earlier. We 
illustrate this situation by solving the ranking and unranking problems for permutations. 

Suppose A is an n-letter alphabet. Recall that a k-permutation of A is a word
$w = w_1 w_2 \cdots w_k$, where the $w_i$'s are distinct elements of A. Let S be the set of all k-permutations
of A. Using the ordinary product rule, we build elements w in S by choosing $w_1 \in A$
in n ways, then choosing $w_2 \in A \setminus \{w_1\}$ in n − 1 ways, and so on. At the ith stage
(where $1 \leq i \leq k$), we choose $w_i \in A \setminus \{w_1, w_2, \ldots, w_{i-1}\}$ in n − (i − 1) ways. Thus,
$|S| = n(n-1)\cdots(n-k+1) = (n){\downarrow}_k$. Notice that the set of choices available at the ith
stage depends on the choices made earlier, but the cardinality of this set is independent of
previous choices. (This last fact is a key hypothesis of the product rule.)

Let us rephrase the preceding counting argument to obtain a bijection between S and
the product set $n \times (n-1) \times \cdots \times (n-k+1)$. Fix a total ordering $x = (x_0, x_1, \ldots, x_{n-1})$ of the
letters in A; equivalently, fix a bijection $x : n \to A$. Suppose $w = w_1 w_2 \cdots w_k \in S$. We must
map w to a k-tuple $(j_1, j_2, \ldots, j_k)$, where $0 \leq j_i < n - (i-1)$. To compute $j_1$, locate $w_1$ in
the sequence $(x_0, x_1, \ldots, x_{n-1})$, let $j_1$ be the number of letters preceding $w_1$ in the sequence,
and then erase $w_1$ from the sequence to get a new sequence x'. To compute $j_2$, find $w_2$ in
the sequence x', let $j_2$ be the number of letters preceding it, and then erase $w_2$ to get a new
sequence x''. Continue similarly to generate the remaining $j_i$'s. This process is reversible, as
demonstrated in the next example, so we have defined the desired bijection. Combining this
bijection with the product rule bijection $p_{n,n-1,\ldots,n-k+1}$ and its inverse, we obtain the
desired ranking and unranking maps. One may verify that these maps correspond to the
alphabetic ordering of permutations specified by the given total ordering of the alphabet A.


5.19. Example. Let n = 8, k = 5, and A = (a,b,c,d,e,f,g,h) with the usual alphabetical
ordering. Let $w = \text{cfbgd} \in S$. We compute $(j_1,\ldots,j_5)$ as follows:

2 letters precede c in (a,b,c,d,e,f,g,h), so $j_1 = 2$;
4 letters precede f in (a,b,d,e,f,g,h), so $j_2 = 4$;
1 letter precedes b in (a,b,d,e,g,h), so $j_3 = 1$;
3 letters precede g in (a,d,e,g,h), so $j_4 = 3$;
1 letter precedes d in (a,d,e,h), so $j_5 = 1$.

Thus, cfbgd ↦ (2,4,1,3,1). The rank of this word is therefore

$p_{8,7,6,5,4}(2,4,1,3,1) = 2 \cdot (7 \cdot 6 \cdot 5 \cdot 4) + 4 \cdot (6 \cdot 5 \cdot 4) + 1 \cdot (5 \cdot 4) + 3 \cdot (4) + 1 = 2193$.

Next, let us unrank the integer 982. First, repeated division gives

$p^{-1}_{8,7,6,5,4}(982) = (1,1,1,0,2)$.

Since $j_1 = 1$, the first letter of the desired word must be b. Removing b from the alphabet
gives (a,c,d,e,f,g,h). Since $j_2 = 1$, the second letter of the desired word is c. Removing c
from the previous list gives (a,d,e,f,g,h). Continuing in this way, we see that 982 unranks to
give the word bcdag.
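
The erase-and-count procedure can be sketched in Python as follows. The names `rank_perm` and `unrank_perm` are hypothetical; the alphabet is any sequence whose order is the fixed total ordering, and `list.index` and `list.pop` do the locating and erasing, which is simple but not the most efficient choice.

```python
def rank_perm(w, alphabet):
    """Rank the k-permutation w of the ordered alphabet (ranks start at 0)."""
    remaining = list(alphabet)
    rank = 0
    for letter in w:
        j = remaining.index(letter)            # number of unused letters preceding it
        rank = rank * len(remaining) + j       # radices n, n-1, ..., n-k+1
        remaining.pop(j)                       # erase the letter
    return rank

def unrank_perm(m, k, alphabet):
    """Return the k-permutation of rank m, as a tuple of letters."""
    n = len(alphabet)
    js = []
    for radix in range(n - k + 1, n + 1):      # divide by n-k+1 first, by n last
        m, j = divmod(m, radix)
        js.append(j)                           # collects j_k, ..., j_1
    remaining = list(alphabet)
    return tuple(remaining.pop(j) for j in reversed(js))

# rank_perm("cfbgd", "abcdefgh") == 2193
# "".join(unrank_perm(982, 5, "abcdefgh")) == "bcdag"
```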

5.20. Example. Let S be the set of permutations of (1,2,3,4,5,6). Using the procedure
above to rank the permutation (4,6,2,1,5,3), we first compute $(j_1,\ldots,j_6) = (3,4,1,0,1,0)$
tion (4,2,5,1,6,3).


5.6 Ranking Subsets


In 1.42, we used the product rule to prove that the number of k-element subsets of an 
n-element set is (7) — ICE This enumeration result was obtained indirectly, by enu- 
merating k-permutations of an n-element set in two ways and then dividing the resulting 
equation by k!. In general, the operation of division presents serious problems when attempt- 
ing to construct bijections. Therefore, we will adopt a different approach to the problem of 
ranking and unranking subsets. Instead of using the bijective product rule, we will apply 
the bijective sum rule to the recursion characterizing the binomial coefficients. This will 
lead us to recursive algorithms for ranking and unranking subsets. 

For convenience, write C(n,k) for the number of k-element subsets of an n-element set.
In 2.25, we saw that these numbers satisfy the recursion


$C(n,k) = C(n-1,k) + C(n-1,k-1) \qquad (0 < k < n)$ (5.1)


with initial conditions C(n,0) = C(n,n) = 1. This recursion came from a combinatorial argument
involving the sum rule. Using the bijective sum rule instead will lead directly to recursively 
defined bijections for ranking and unranking. For each alphabet A, introduce the temporary 
notation $S_k(A)$ to denote the set of all k-element subsets of A. We assume that all alphabets
to be considered are equipped with some fixed total ordering that allows us to rank and
unrank individual letters of the alphabet. Suppose $A = (x_0, x_1, \ldots, x_{n-1})$ is such an alphabet
with n letters. We can write $S_k(A)$ as the disjoint union of sets T and U, where T consists
of all subsets that do not contain $x_{n-1}$ and U consists of all subsets that contain $x_{n-1}$.
Note that $T = S_k(A \sim \{x_{n-1}\})$, and U corresponds to $S_{k-1}(A \sim \{x_{n-1}\})$ via a bijection
that deletes $x_{n-1}$ from a subset belonging to U. We can use recursion to obtain ranking
and unranking maps for $S_k(A \sim \{x_{n-1}\})$ and $S_{k-1}(A \sim \{x_{n-1}\})$, as these involve subsets
drawn from smaller alphabets. Then we combine these maps using the bijective sum rule
to get ranking and unranking maps for $S_k(A)$.

Writing out the definitions, we arrive at the following recursive ranking algorithm for
mapping a subset $B \in S_k(A)$ to an integer:


• If k = 0 (so B = ∅), then return the answer 0.


• If k > 0 and the last letter x in A does not belong to B, then return the ranking of B
relative to the set $S_k(A \sim \{x\})$, which we compute recursively using this very algorithm.


• If k > 0 and the last letter x in A does belong to B, let i be the ranking of $B' = B \sim \{x\}$
relative to the set $S_{k-1}(A \sim \{x\})$ (computed recursively), and return the answer
i + C(n - 1, k). Note that C(n - 1, k) can be computed using the recursion (5.1) for
binomial coefficients.


The inverse map is the following recursive unranking algorithm that maps an integer m
to a subset $B \in S_k(A)$:


• If k = 0 (so m must be zero), then return ∅.


• If k > 0 and $0 \le m < C(n-1,k)$, then return the result of unranking m relative to the
set $S_k(A \sim \{x\})$, where x is the last letter of A.


• If k > 0 and $C(n-1,k) \le m < C(n,k)$, then let $B'$ be the unranking of $m - C(n-1,k)$
relative to the set $S_{k-1}(A \sim \{x\})$, and return $B' \cup \{x\}$.
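
The two recursive algorithms above translate almost line by line into code. The following Python sketch is illustrative only: the function names are hypothetical, the alphabet is assumed to be a list in its fixed total order, subsets are Python sets, and `math.comb` stands in for C(n,k) (it returns 0 when k exceeds n).

```python
from math import comb

def rank_subset(B, A):
    """Rank the set B among the len(B)-element subsets of the ordered list A."""
    k, n = len(B), len(A)
    if k == 0:
        return 0
    x, rest = A[-1], A[:-1]                  # x is the last letter of A
    if x not in B:
        return rank_subset(B, rest)
    return comb(n - 1, k) + rank_subset(B - {x}, rest)

def unrank_subset(m, k, A):
    """Return the k-element subset of the ordered list A whose rank is m."""
    n = len(A)
    if not 0 <= m < comb(n, k):
        raise ValueError("rank out of range")
    if k == 0:
        return set()
    x, rest = A[-1], A[:-1]
    if m < comb(n - 1, k):                   # subsets avoiding x are listed first
        return unrank_subset(m, k, rest)
    return unrank_subset(m - comb(n - 1, k), k - 1, rest) | {x}

A = list("abcdefgh")
print(rank_subset(set("cdfg"), A))           # 30
print(sorted(unrank_subset(53, 4, A)))       # ['c', 'e', 'f', 'h']
```

The two branches in each function correspond to the sets T and U of the bijective sum rule: subsets that avoid the last letter receive the ranks below C(n - 1, k), and the remaining ranks go to subsets that contain it.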
5.21. Example. Let A = (a,b,c,d,e,f,g,h), and let us rank the subset B = {c,d,f,g} ∈
$S_4(A)$. Since h ∉ B, we recursively proceed to rank B relative to the 7-letter alphabet
$A_1$ = (a,b,c,d,e,f,g). The new last letter g belongs to B, so we must add C(7 - 1, 4) = 15
to the rank of $B_1$ = {c,d,f} relative to (a,b,c,d,e,f). The last letter f belongs to $B_1$, so
we must add C(6 - 1, 3) = 10 to the rank of $B_2$ = {c,d} relative to (a,b,c,d,e). This is the
same as the rank of $B_2$ relative to (a,b,c,d), which is C(4 - 1, 2) = 3 plus the rank of {c}
relative to (a,b,c), which is C(3 - 1, 1) = 2 plus the rank of ∅ relative to (a,b). Two more
reductions reveal that the latter rank is zero. Adding up the contributions, we see that the
rank of B is

$\binom{3-1}{1} + \binom{4-1}{2} + \binom{6-1}{3} + \binom{7-1}{4} = 30.$

Generalizing the pattern in the previous example, we can give the following non-recursive 
formula for the rank of a subset. 


5.22. Sum Formula for Ranking Subsets. If $A = (x_0, x_1, \ldots, x_{n-1})$ and $B =
\{x_{i_1}, x_{i_2}, \ldots, x_{i_k}\}$ where $i_1 < i_2 < \cdots < i_k$, then the rank of B relative to $S_k(A)$ is $\sum_{j=1}^{k} \binom{i_j}{j}$.


A routine induction argument can be used to prove this formula formally. 


5.23. Example. Now we illustrate the recursive unranking algorithm. Let us unrank the
integer 53 to obtain an object $B \in S_4(A)$, where A = (a,b,c,d,e,f,g,h). Here n = 8
and k = 4. Since C(7,4) = 35 ≤ 53, we know that h ∈ B. We proceed by unranking
53 - 35 = 18 to get a 3-element subset of (a,b,c,d,e,f,g). Now C(6,3) = 20 > 18, so g does
not lie in the subset. We proceed to unrank 18 to get a 3-element subset of (a,b,c,d,e,f).
Now C(5,3) = 10 ≤ 18, so f does belong to B. We continue, unranking 18 - 10 = 8 to
get a two-element subset of (a,b,c,d,e). Since C(4,2) = 6 ≤ 8, e ∈ B and we continue by
unranking 2. We have C(3,1) = 3 > 2, so d ∉ B. But at the next stage C(2,1) = 2 ≤ 2, so
c ∈ B. We conclude, finally, that B = {c,e,f,h}.


As before, we can describe this algorithm iteratively instead of recursively. 


5.24. Unranking Algorithm for Subsets. Suppose $A = (x_0, x_1, \ldots, x_{n-1})$ and we are
unranking an integer m to get a k-element subset B of A. Repeatedly perform the following
steps until k becomes zero: let i be the largest integer such that $C(i,k) \le m$; declare that
$x_i \in B$; replace m by $m - C(i,k)$ and decrement k by 1.
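
The sum formula 5.22 and the iterative algorithm 5.24 can be tried out with the short Python sketch below. It is an illustration under stated assumptions rather than a definitive implementation: the function names are hypothetical, and subsets are represented by the indices $i_1 < \cdots < i_k$ of their letters in A rather than by the letters themselves.

```python
from math import comb

def rank_indices(indices):
    """Sum formula 5.22: rank of the subset {x_i : i in indices}."""
    return sum(comb(i, j) for j, i in enumerate(sorted(indices), start=1))

def unrank_indices(m, k):
    """Algorithm 5.24: indices of the k-element subset with rank m."""
    chosen = []
    while k > 0:
        i = k - 1                      # comb(k - 1, k) == 0 <= m, so this start is valid
        while comb(i + 1, k) <= m:     # advance to the largest i with comb(i, k) <= m
            i += 1
        chosen.append(i)               # declare x_i to be in B
        m -= comb(i, k)
        k -= 1
    return sorted(chosen)

print(rank_indices([2, 3, 5, 6]))      # 30, the rank of {c,d,f,g} in Example 5.21
print(unrank_indices(53, 4))           # [2, 4, 5, 7], i.e. {c,e,f,h} as in Example 5.23
```

Note that the iterative version never needs the size n of the alphabet; it only assumes that m is a valid rank, that is, m < C(n,k).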


We close with a remark about the ordering of subsets associated to the ranking and 
unranking algorithms described above. Let x be the last letter of A. If we unrank the 
integers 0,1,2,... in this order to obtain a listing of $S_k(A)$, we will obtain all k-element
subsets of A not containing x first, and all k-element subsets of A containing x second. Each
of these sublists is internally ordered in the same way according to the next-to-last letter of 
A, and so on recursively. In contrast, if we had used the bijective sum rule on the recursion 


$C(n,k) = C(n-1,k-1) + C(n-1,k)$


(in which the order of the summands is swapped), then the ordering rules at each level of this 
hierarchy would be reversed. Similarly, the reader can construct variant ranking algorithms 
in which the first letter of the alphabet is considered “most significant,” etc. Some of these 
variants are explored in the exercises. 
5.7 Ranking Anagrams


Next we study the problem of ranking and unranking anagrams. Recall that $R(a_1^{n_1} \cdots a_k^{n_k})$
is the set of all words of length $n = n_1 + \cdots + n_k$ consisting of $n_i$ copies of $a_i$ for $1 \le i \le k$.
We have seen (§1.9) that these sets are counted by the multinomial coefficients:


$|R(a_1^{n_1} \cdots a_k^{n_k})| = \binom{n}{n_1, n_2, \ldots, n_k} = \frac{n!}{n_1!\, n_2! \cdots n_k!}.$


There are at least three ways of deriving this formula. One way counts permutations of 
n distinct letters in two ways, and solves for the number of anagrams by division. This 
method is not easily converted into a ranking algorithm. A second way uses the product 
rule, choosing the positions for the $n_1$ copies of $a_1$, then the positions for the $n_2$ copies of
$a_2$, and so on. Combining the bijective product rule with the ranking algorithm for subsets
presented earlier, this method does lead to a ranking algorithm for anagrams. A third way 
to count anagrams involves finding recursions involving the multinomial coefficients (§2.5). 
This is the approach we will pursue here. 

Let $C(n; n_1, \ldots, n_k)$ be the number of rearrangements of n letters, where there are $n_i$
letters of type i. Classifying words by their first letter leads to the recursion


$C(n; n_1, \ldots, n_k) = \sum_{i=1}^{k} C(n-1; n_1, \ldots, n_i - 1, \ldots, n_k).$

Applying the bijective sum rule to this recursion, we are led to recursive ranking and 
unranking algorithms for anagrams. 

Here are the details of the algorithms. We recursively define ranking maps


$r = r_{a_1^{n_1} \cdots a_k^{n_k}} : R(a_1^{n_1} \cdots a_k^{n_k}) \to m,$


where $m = (n_1 + \cdots + n_k)!/(n_1! \cdots n_k!)$. If any $n_i$ is negative, r is the function with graph
∅. If all $n_i$'s are zero, r is the function sending the empty word to 0. To compute r(w) in
the remaining case, suppose $a_i$ is the first letter of w. Write $w = a_i w'$. Return the answer

$r(w) = \sum_{j<i} C(n-1; n_1, \ldots, n_j - 1, \ldots, n_k) + r'(w'),$


where $r'(w')$, the rank of $w'$ in $R(a_1^{n_1} \cdots a_i^{n_i - 1} \cdots a_k^{n_k})$, is computed recursively by the same algorithm.
Next, we define the corresponding unranking maps

$u = u_{a_1^{n_1} \cdots a_k^{n_k}} : m \to R(a_1^{n_1} \cdots a_k^{n_k}).$

Use the only possible maps if some $n_i < 0$ or if all $n_i = 0$. Otherwise, to unrank s ∈ m, first
find the maximal index i such that $n_i > 0$ and $\sum_{j<i} C(n-1; n_1, \ldots, n_j - 1, \ldots, n_k) \le s$;
let $s'$ be the difference between s and this sum. Recursively compute

$w' = u_{a_1^{n_1} \cdots a_i^{n_i - 1} \cdots a_k^{n_k}}(s');$

finally, return the answer $w = a_i w'$. This unranking algorithm induces a listing of the
anagrams in $R(a_1^{n_1} \cdots a_k^{n_k})$ “in alphabetical order” relative to the alphabet ordering $a_1 <
a_2 < \cdots < a_k$.
5.25. Example. Let us compute the rank of the word w = abbcacb in $R(a^2 b^3 c^2)$; here
n = 7, $n_1 = 2$, $n_2 = 3$, and $n_3 = 2$. Erasing the first letter a, we see that the rank of w
equals zero plus the rank of $w_1$ = bbcacb; now n = 6, $n_1 = 1$, $n_2 = 3$, and $n_3 = 2$. Erasing
b, we must now add $\binom{5}{0,3,2} = 10$ to the rank of $w_2$ = bcacb; now n = 5, $n_1 = 1$, $n_2 = 2$,
and $n_3 = 2$. Erasing the next b, we must add $\binom{4}{0,2,2} = 6$ to the rank of $w_3$ = cacb; now
n = 4, $n_1 = 1$, $n_2 = 1$, and $n_3 = 2$. Erasing c, we must add $\binom{3}{0,1,2} + \binom{3}{1,0,2} = 6$ to the rank
of $w_4$ = acb; now n = 3, $n_1 = 1$, $n_2 = 1$, and $n_3 = 1$. Continuing in this way, one sees that
the rank of acb is 1. Thus, the rank of the original word is 10 + 6 + 6 + 1 = 23.

Next, let us unrank 91 to obtain a word w in $R(a^2 b^3 c^2)$. To determine the first letter
of w, note that 0 ≤ 91, $\binom{6}{1,3,2} = 60 \le 91$, but $\binom{6}{1,3,2} + \binom{6}{2,2,2} = 150 > 91$. Thus, the first
letter is b, and we continue by unranking 91 - 60 = 31 to obtain a word in $R(a^2 b^2 c^2)$. This
time, we have 0 ≤ 31, $\binom{5}{1,2,2} = 30 \le 31$, but $\binom{5}{1,2,2} + \binom{5}{2,1,2} = 60 > 31$. So the second letter
is b, and we continue by unranking 31 - 30 = 1 to obtain a word in $R(a^2 b^1 c^2)$. It is routine
to check that the next two letters are both a, and we continue by unranking 1 to obtain a
word in $R(b^1 c^2)$. The word we get is cbc, so the unranking of 91 is the word bbaacbc.
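
The anagram algorithms of this section can be checked against Example 5.25 with the following Python sketch. It is an illustration under stated assumptions, not code from the text: the function names are hypothetical, words are strings, the alphabet is a string listing the letters in increasing order, and the recursion is unrolled into a loop over the letters of the word.

```python
from math import factorial

def multinomial(counts):
    """C(n; n_1, ..., n_k), taken to be 0 if some count is negative."""
    if any(c < 0 for c in counts):
        return 0
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total

def rank_anagram(word, alphabet):
    counts = [word.count(a) for a in alphabet]
    rank = 0
    for letter in word:
        i = alphabet.index(letter)
        for j in range(i):                 # words beginning with an earlier letter come first
            counts[j] -= 1
            rank += multinomial(counts)
            counts[j] += 1
        counts[i] -= 1                     # erase the first letter and continue with the rest
    return rank

def unrank_anagram(s, counts, alphabet):
    counts = list(counts)
    word = []
    for _ in range(sum(counts)):
        for i in range(len(alphabet)):
            counts[i] -= 1
            block = multinomial(counts)    # number of words whose next letter is alphabet[i]
            if s < block:
                word.append(alphabet[i])
                break
            s -= block
            counts[i] += 1
    return "".join(word)

print(rank_anagram("abbcacb", "abc"))          # 23
print(unrank_anagram(91, [2, 3, 2], "abc"))    # bbaacbc
```

Letters whose remaining count is zero contribute a block of size 0, so they are skipped automatically, which matches the condition $n_i > 0$ in the unranking rule.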




5.8 Ranking Integer Partitions 


In this section, we devise ranking and unranking algorithms for integer partitions by apply- 
ing the bijective sum rule to the recursion 2.42. Let P(n,k) be the set of integer partitions 
of n with largest part k, and let p(n,k) = |P(n,k)|. Recall from 2.42 that these numbers 
satisfy 

$p(n,k) = p(n-k,k) + p(n-1,k-1) \qquad (n, k > 0).$


The first term on the right counts elements of P(n,k) in which the largest part occurs 
at least twice (deleting the first copy of this part gives a bijection onto P(n — k,k)). The 
second term on the right counts elements of P(n,k) in which the largest part occurs exactly 
once (reducing this part by one gives a bijection onto P(n - 1, k - 1)). Combining these
bijections with the bijective sum rule, we obtain recursively determined ranking maps $r =
r_{n,k} : P(n,k) \to p(n,k)$. To find $r_{n,k}(\mu)$, consider three cases. If $\mu$ has only one part (which
happens when n = k), return 0. If $k = \mu_1 = \mu_2$, return $r_{n-k,k}((\mu_2, \mu_3, \ldots))$. If $k = \mu_1 > \mu_2$,
return $p(n-k,k) + r_{n-1,k-1}((\mu_1 - 1, \mu_2, \ldots))$. The unranking maps $u = u_{n,k} : p(n,k) \to
P(n,k)$ operate as follows. To compute u(m) where $0 \le m < p(n,k)$, consider two cases.
If $0 \le m < p(n-k,k)$, recursively compute $\nu = u_{n-k,k}(m)$ and return the answer $\mu =
(k, \nu_1, \nu_2, \ldots)$. If $p(n-k,k) \le m < p(n,k)$, recursively compute $\nu = u_{n-1,k-1}(m - p(n-k,k))$
and return the answer $\mu = (\nu_1 + 1, \nu_2, \nu_3, \ldots)$.


5.26. Example. Let us compute $r_{8,3}(\mu)$, where $\mu = (3,3,1,1)$. Since $\mu_1 = \mu_2$, the rank
will be $r_{5,3}(\nu)$, where $\nu = (3,1,1)$. Next, since $\nu_1 \ne \nu_2$, we have


$r_{5,3}(3,1,1) = p(2,3) + r_{4,2}(2,1,1) = r_{4,2}(2,1,1).$

The first two parts of the new partition are again different, so

$r_{4,2}(2,1,1) = p(2,2) + r_{3,1}(1,1,1) = 1 + r_{3,1}(1,1,1).$

After several more steps, we find that $r_{3,1}(1,1,1) = 0$, so $r_{8,3}(\mu) = 1$. Thus $\mu$ is the second
partition in the listing of P(8,3) implied by the ranking algorithm; the first partition in
this list, which has rank 0, is (3,3,2).
Next, let us compute $\mu = u_{10,4}(6)$. First, p(6,4) = 2 ≤ 6, so $\mu$ will be obtained by adding
one to the first part of $\nu = u_{9,3}(4)$. Second, p(6,3) = 3 ≤ 4, so $\nu$ will be obtained by adding
one to the first part of $\rho = u_{8,2}(1)$. Third, p(6,2) = 3 > 1, so $\rho$ will be obtained by adding
a new first part of length 2 to $\xi = u_{6,2}(1)$. Fourth, p(4,2) = 2 > 1, so $\xi$ will be obtained
by adding a new first part of length 2 to $\zeta = u_{4,2}(1)$. Fifth, p(2,2) = 1 ≤ 1, so $\zeta$ will be
obtained by adding one to the first part of $\omega = u_{3,1}(0)$. We must have $\omega = (1,1,1)$, this
being the unique element of P(3,1). Working our way back up the chain, we successively
find that


$\zeta = (2,1,1), \quad \xi = (2,2,1,1), \quad \rho = (2,2,2,1,1), \quad \nu = (3,2,2,1,1),$

and finally $\mu = u_{10,4}(6) = (4,2,2,1,1)$.

Now that we have algorithms to rank and unrank the sets P(n,k), we can apply the 
bijective sum rule to the identity 
$p(n) = p(n,n) + p(n,n-1) + \cdots + p(n,1)$
to rank and unrank the set P(n) of all integer partitions of n. 


5.27. Example. Let us enumerate all the integer partitions of 6. We obtain this list of 
partitions by concatenating the lists associated to the sets 


$P(6,6),\ P(6,5),\ \ldots,\ P(6,1),$


written in this order. In turn, each of these lists can be constructed by applying the
unranking maps $u_{6,k}$ to the integers 0,1,2,...,p(6,k) - 1. The reader can verify that this
procedure leads to the following list:


(6), (5,1), (4,2), (4,1,1), (3,3), (3,2,1), (3,1,1,1),
(2,2,2), (2,2,1,1), (2,1,1,1,1), (1,1,1,1,1,1).


One may also check that the list obtained in this way presents the integer partitions of 
n in decreasing lexicographic order (as defined in 10.36). 
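
The ranking and unranking maps for P(n,k) can be sketched in Python as follows. This is an illustrative sketch, not code from the text: the function names are hypothetical, partitions are weakly decreasing tuples, and the base case p(0,0) = 1 (with p(n,k) = 0 whenever exactly one argument is zero or an argument is negative) is an assumption chosen so that the recursion bottoms out at the empty partition.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n, k):
    """Number of partitions of n with largest part k."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0:
        return 0
    return p(n - k, k) + p(n - 1, k - 1)

def rank_partition(mu):
    """Rank mu within P(n, k), where n = sum(mu) and k = mu[0]."""
    n, k = sum(mu), mu[0]
    if len(mu) == 1:
        return 0
    if mu[0] == mu[1]:                       # largest part repeated: delete one copy
        return rank_partition(mu[1:])
    return p(n - k, k) + rank_partition((mu[0] - 1,) + mu[1:])

def unrank_partition(m, n, k):
    """Return the partition in P(n, k) whose rank is m."""
    if n == 0:
        return ()                            # only the empty partition remains
    if m < p(n - k, k):
        return (k,) + unrank_partition(m, n - k, k)
    nu = unrank_partition(m - p(n - k, k), n - 1, k - 1)
    return (nu[0] + 1,) + nu[1:] if nu else (1,)

print(rank_partition((3, 3, 1, 1)))          # 1
print(unrank_partition(6, 10, 4))            # (4, 2, 2, 1, 1)
listing = [unrank_partition(m, 6, k) for k in range(6, 0, -1) for m in range(p(6, k))]
print(listing[:4])                           # [(6,), (5, 1), (4, 2), (4, 1, 1)]
```

Concatenating the lists for k = 6, 5, ..., 1, as in the last lines, reproduces the listing of the partitions of 6 given in Example 5.27.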




5.9 Ranking Set Partitions 


Next, we consider the ranking and unranking of set partitions (which are counted by Stirling 
numbers of the second kind and Bell numbers). The recursion for Stirling numbers involves 
both addition and multiplication, so our recursive algorithms will use both the bijective 
sum rule and the bijective product rule. 

Let SP(n,k) be the set of all set partitions of {1,2,...,n} into exactly k blocks, and let 
S(n,k) =|SP(n,k)| be the associated Stirling number of the second kind. Recall from 2.52 
that 

$S(n,k) = S(n-1,k-1) + k\,S(n-1,k) \qquad (n, k > 0).$


The first term counts set partitions in SP(n,k) such that n is in a block by itself; removal
of this block gives a bijection onto SP(n - 1, k - 1). The second term counts set partitions
$\pi$ in SP(n,k) such that n belongs to a block with other elements. Starting with any set
partition $\pi'$ in SP(n - 1, k), we can build such a set partition $\pi \in SP(n,k)$ by adding n
to any of the k nonempty blocks of $\pi'$. We number the blocks of $\pi'$ using 0,1,...,k - 1
by arranging the minimum elements of these blocks in increasing order. For example, if
$\pi' = \{\{6,3,5\}, \{2\}, \{1,7\}, \{8,4\}\}$, then block 0 of $\pi'$ is {1,7}, block 1 is {2}, block 2 is
{3,5,6}, and block 3 is {4,8}.

The ranking maps $r = r_{n,k} : SP(n,k) \to S(n,k)$ are defined recursively as follows. Use
the only possible maps if k ≤ 0 or k > n. For 0 < k ≤ n, compute $r_{n,k}(\pi)$ as follows. If
$\{n\} \in \pi$, return the answer $r_{n-1,k-1}(\pi \sim \{\{n\}\})$. Otherwise, let $\pi'$ be obtained from $\pi$ by
deleting n from whatever block contains it, and let i be the index of the block of $\pi'$ that
used to contain n. Return $S(n-1,k-1) + p_{k,S(n-1,k)}(i, r_{n-1,k}(\pi'))$.

Similarly, we define the unranking maps $u = u_{n,k} : S(n,k) \to SP(n,k)$ as follows.
Assume n, k > 0 and we are computing $u_{n,k}(m)$. If $0 \le m < S(n-1,k-1)$, then return
$u_{n-1,k-1}(m) \cup \{\{n\}\}$. If $S(n-1,k-1) \le m < S(n,k)$, first compute $(i, j) = p^{-1}_{k,S(n-1,k)}(m -
S(n-1,k-1))$. Next, calculate the partition $\pi' = u_{n-1,k}(j)$ by unranking j recursively,
and finally compute $\pi$ by adding n to the ith block of $\pi'$.


5.28. Example. Let us compute the rank of $\pi = \{\{1,7\}, \{2,4,5\}, \{3,8\}, \{6\}\}$ relative to
the set SP(8,4). In the first stage of the recursion, removal of the largest element 8 from
block 2 leaves the set partition $\pi' = \{\{1,7\}, \{2,4,5\}, \{3\}, \{6\}\}$. Therefore,


$r_{8,4}(\pi) = S(7,3) + 2S(7,4) + r_{7,4}(\pi') = 301 + 2 \cdot 350 + r_{7,4}(\pi').$


(See Figure 2.21 for a table of Stirling numbers, which were calculated using the recursion
for S(n,k).) In the second stage, removing 7 from block 0 leaves the set partition $\pi'' =
\{\{1\}, \{2,4,5\}, \{3\}, \{6\}\}$. Hence,

$r_{7,4}(\pi') = S(6,3) + 0S(6,4) + r_{6,4}(\pi'') = 90 + r_{6,4}(\pi'').$


In the third stage, removing the block {6} leaves the set partition $\pi^{(3)} = \{\{1\}, \{2,4,5\}, \{3\}\}$,
and

$r_{6,4}(\pi'') = r_{5,3}(\pi^{(3)}).$

In the fourth stage, removing 5 from block 1 leaves the set partition $\pi^{(4)} = \{\{1\}, \{2,4\}, \{3\}\}$,
and

$r_{5,3}(\pi^{(3)}) = S(4,2) + 1S(4,3) + r_{4,3}(\pi^{(4)}) = 7 + 6 + r_{4,3}(\pi^{(4)}).$

In the fifth stage, removing 4 from block 1 leaves the set partition $\pi^{(5)} = \{\{1\}, \{2\}, \{3\}\}$,
and

$r_{4,3}(\pi^{(4)}) = S(3,2) + 1S(3,3) + r_{3,3}(\pi^{(5)}) = 3 + 1 + r_{3,3}(\pi^{(5)}).$


But $r_{3,3}(\pi^{(5)})$ is zero, since |SP(3,3)| = 1. We deduce in sequence

$r_{4,3}(\pi^{(4)}) = 4, \quad r_{6,4}(\pi'') = r_{5,3}(\pi^{(3)}) = 17, \quad r_{7,4}(\pi') = 107, \quad r_{8,4}(\pi) = 1108.$


Next, let us compute $u_{7,3}(111)$. The input 111 weakly exceeds S(6,2) = 31, so we must
first compute $p^{-1}_{3,90}(111 - 31) = (0, 80)$. This means that 7 will go in block 0 of $u_{6,3}(80)$.
Now 80 ≥ S(5,2) = 15, so we compute $p^{-1}_{3,25}(80 - 15) = (2, 15)$. This means that 6 will go
in block 2 of $u_{5,3}(15)$. Now 15 ≥ S(4,2) = 7, so we compute $p^{-1}_{3,6}(15 - 7) = (1, 2)$. This
means that 5 will go in block 1 of $u_{4,3}(2)$. Now 2 < S(3,2) = 3, so 4 is in a block by
itself in $u_{4,3}(2)$. To find the remaining blocks, we compute $u_{3,2}(2)$. Now 2 ≥ S(2,1) = 1, so
we compute $p^{-1}_{2,1}(2 - 1) = (1, 0)$. This means that 3 goes in block 1 of $u_{2,2}(0)$. Evidently,
$u_{2,2}(0) = \{\{1\}, \{2\}\}$. Using the preceding information to insert elements 3,4,...,7, we
conclude that

$u_{7,3}(111) = \{\{1,7\}, \{2,3,5\}, \{4,6\}\}.$
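
The recursive maps for SP(n,k) can be sketched in Python as shown below. The sketch is illustrative only and rests on a few assumptions that are not in the text: the function names are hypothetical, a set partition is passed as a collection of Python sets, and the product-rule map is taken to be $p_{k,S}(i,j) = iS + j$, which agrees with the arithmetic in Example 5.28.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind S(n, k)."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def rank_set_partition(blocks, n):
    """Rank a set partition of {1, ..., n} within SP(n, k), k = number of blocks."""
    if n == 0:
        return 0
    blocks = sorted((set(b) for b in blocks), key=min)   # block i has the i-th smallest minimum
    k = len(blocks)
    i = next(idx for idx, b in enumerate(blocks) if n in b)
    if blocks[i] == {n}:                                 # n is in a block by itself
        return rank_set_partition(blocks[:i] + blocks[i + 1:], n - 1)
    smaller = [b - {n} for b in blocks]
    return (stirling2(n - 1, k - 1) + i * stirling2(n - 1, k)
            + rank_set_partition(smaller, n - 1))

def unrank_set_partition(m, n, k):
    """Return the blocks (ordered by minimum element) of the partition of rank m."""
    if n == 0:
        return []
    first = stirling2(n - 1, k - 1)
    if m < first:
        return unrank_set_partition(m, n - 1, k - 1) + [{n}]
    i, j = divmod(m - first, stirling2(n - 1, k))        # inverse of p_{k, S(n-1, k)}
    blocks = unrank_set_partition(j, n - 1, k)
    blocks[i].add(n)
    return blocks

pi = [{1, 7}, {2, 4, 5}, {3, 8}, {6}]
print(rank_set_partition(pi, 8))             # 1108
print(unrank_set_partition(111, 7, 3))       # [{1, 7}, {2, 3, 5}, {4, 6}]
```

Because n is always the largest element, deleting it from a block with other elements never changes that block's minimum, so the block index i is the same in $\pi$ and in $\pi'$.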
The ranking/unranking procedure given here lists the objects in SP(n,k) in the following
order. Set partitions with n in its own block appear first. Next come the set partitions with 
n in block zero (i.e., n is in the same block as 1); then come the set partitions with n in 
block one, etc. By applying the sum and product bijections in different orders, one can 
obtain different listings of the elements of SP(n, k). 

Let SP(n) be the set of all set partitions of {1,2,...,n}, so |SP(n)| is the nth Bell number. The
preceding results lead to ranking and unranking algorithms for this collection, by applying
the bijective sum rule to the disjoint union


$SP(n) = SP(n,1) \cup SP(n,2) \cup \cdots \cup SP(n,n).$


Another approach to ranking SP(n) is to use the recursion in 2.53. Details of this approach
are left as an exercise. 




5.10 Ranking Card Hands 


We now apply the preceding ideas to the problem of ranking and unranking certain poker 
hands. We will use the bijective sum and product rules to transform the counting arguments 
from §1.13 into ranking and unranking bijections. 

In this section, we define Deck = Suits x Values, where 


Suits = {♣, ♦, ♥, ♠};


Values = {A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K}.


(A slightly different definition was used in §1.13.) The displayed orderings of the suits
and values determine ranking and unranking bijections Suits ↔ 4 and Values ↔ 13. For
example, r(♦) = 1 = r(2); 11 unranks to the value Q; and 3 unranks to the suit ♠.
Combining these maps with the map $p_{4,13}$ from the bijective product rule, we obtain ranking
and unranking bijections Deck ↔ 52. Since we are thinking of Deck as the product set
Suits × Values, the suit of a card is more significant than its value when ranking. With
these conventions, the list of cards generated by unranking 0,1,...,51 in this order runs as
follows:


A♣, 2♣, ..., K♣, A♦, 2♦, ..., K♦, A♥, ..., K♥, A♠, ..., Q♠, K♠.


Naturally, all ranking and unranking results in the examples below depend on this chosen 
ordering of the deck. 

Recall that a poker hand is a five-element subset of Deck. To rank or unrank such hands, 
we can use the bijections Deck ↔ 52 to reduce to the problem of ranking and unranking
five-element subsets of 52. This problem was solved in §5.6. More interesting ranking and
unranking problems arise if we restrict attention to certain special kinds of hands, like a full 
house. We discuss some of these problems next; other examples are treated in the exercises. 


5.29. Example: Four-of-a-kind hands. Let S be the set of all four-of-a-kind poker
hands. Recall (§1.13) that we can build a hand H ∈ S (via the ordinary product rule) by
picking one of the 13 values v ∈ Values, and then picking one of the 48 cards c in Deck ∼
(Suits × {v}). The rank of the hand H, relative to this particular construction method for
S, is

$p_{13,48}(r(v), r'(c)),$


where r : Values → 13 is the ranking function for card values, and $r'$ is the ranking function
on Deck ∼ (Suits × {v}) induced by the usual rank function on Deck. More precisely, $r'(c)$
is the number of cards preceding c in the standard ordering of the deck after throwing out
the four cards of value v. If $c = (s_1, v_1)$, one checks that $r'(c) = 12\,r(s_1) + r(v_1) - \chi(v_1 > v)$.

For example, let us rank the four-of-a-kind hand H = {5♠, 8♥, 5♦, 5♣, 5♥}. Here, v = 5
and c = 8♥ = (♥, 8). In the full deck, c has rank 13·2 + 7 = 33. But in the deck with the 5's
removed, c has rank $r'(c)$ = 12·2 + 6 = 30. Accordingly, $r(H) = p_{13,48}(4, 30) = 4 \cdot 48 + 30 =
222$.

To illustrate unranking, let us compute u(600). First, $p^{-1}_{13,48}(600) = (12, 24)$. It follows
that v = u(12) = K. Next, unranking 24 relative to the deck with the four kings deleted
gives us the card c = A♥. Therefore, u(600) = {K♣, K♦, K♥, K♠, A♥}.


5.30. Example: Full house hands. Let S be the set of all full house hands. Recall (§1.13)
that we can build a hand H ∈ S from the data (x, B, y, C), where x ∈ Values, B is a three-
element subset of Suits, y ∈ Values ∼ {x}, and C is a two-element subset of Suits. For
example, the data

(x, B, y, C) = (J, {♣, ♦, ♠}, 9, {♣, ♥})

generate the full house hand H = {J♣, J♦, J♠, 9♣, 9♥}. The rank of this hand is


$p_{13,4,12,6}(r(x), r(B), r_x(y), r(C)),$


where we use the same letter r to denote various ranking functions on the set of choices
available at each stage. We write $r_x(y)$ to emphasize that the ranking function for y ∈
Values ∼ {x} depends on the value of the previous choice x. For the sample choice sequence
considered above, we get


$r(H) = p_{13,4,12,6}(10, 1, 8, 1) = 2880 + 72 + 48 + 1 = 3001.$


(We are using the ranking functions for k-element subsets of Suits, which were discussed
in §5.6.) Observe that the answer depends critically on the precise ordering of the choices
we made in the counting argument. If we had chosen the data in the order (x, y, B, C), for
example, then we would obtain a different answer for r(H).

To illustrate unranking, let us compute u(515). First,


$p^{-1}_{13,4,12,6}(515) = (1, 3, 1, 5).$


Continuing to unrank, x = u(1) = 2, B = u(3) = {♦, ♥, ♠}, y = $u_2(1)$ = 3 (since the value
2 has been deleted), and C = u(5) = {♥, ♠}. So


u(515) = {2♦, 2♥, 2♠, 3♥, 3♠}.
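
The arithmetic in Examples 5.29 and 5.30 can be reproduced with the Python sketch below. It is a minimal illustration under stated assumptions: the function names are hypothetical, the suits are written as the letters C, D, H, S (standing for ♣, ♦, ♥, ♠ in the order fixed above), the product-rule map is implemented as a mixed-radix value with the first coordinate most significant, and subsets of Suits are ranked by the sum formula 5.22.

```python
from math import comb

SUITS = ["C", "D", "H", "S"]     # clubs, diamonds, hearts, spades, in the text's order
VALUES = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]

def product_rank(digits, radices):
    """p_{n1,...,nk}: mixed-radix value, first coordinate most significant."""
    rank = 0
    for d, n in zip(digits, radices):
        if not 0 <= d < n:
            raise ValueError("digit out of range")
        rank = rank * n + d
    return rank

def product_unrank(m, radices):
    """Inverse of product_rank, by repeated division starting from the last radix."""
    digits = []
    for n in reversed(radices):
        m, d = divmod(m, n)
        digits.append(d)
    return digits[::-1]

def subset_rank(indices):
    """Sum formula 5.22 applied to the positions of the chosen elements."""
    return sum(comb(i, j) for j, i in enumerate(sorted(indices), start=1))

def rank_full_house(x, B, y, C):
    """Rank the full house built from the data (x, B, y, C) of Example 5.30."""
    others = [v for v in VALUES if v != x]       # values still available for y
    digits = [
        VALUES.index(x),
        subset_rank(SUITS.index(s) for s in B),
        others.index(y),
        subset_rank(SUITS.index(s) for s in C),
    ]
    return product_rank(digits, [13, 4, 12, 6])

print(rank_full_house("J", {"C", "D", "S"}, "9", {"C", "H"}))   # 3001
print(product_unrank(515, [13, 4, 12, 6]))                      # [1, 3, 1, 5]
print(product_rank([4, 30], [13, 48]))                          # 222
```

Changing the order of the radices, for example to (13, 12, 4, 6), would correspond to choosing the data in the order (x, y, B, C) and would give a different rank for the same hand.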


5.31. Example: Two-pair hands. Let S be the set of two-pair poker hands. This time,
we build H ∈ S from data (B, C, D, z), where B is a two-element subset of Values, C is
a two-element subset of Suits, D is a two-element subset of Suits, and z = (s, v) is a card
such that the value v is not in B. Using the bijective product rule, we have


$r(H) = p_{78,6,6,44}(r(B), r(C), r(D), r_B(z)).$

For example, let us find the rank of H = {2♠, 2♣, 9♣, 9♥, K♦}, which arises from the data

(B, C, D, z) = ({2, 9}, {♣, ♠}, {♣, ♥}, (♦, K)).


The ranking formula developed in §5.6 gives $r(B) = \binom{1}{1} + \binom{8}{2} = 29$; similarly, r(C) = 3
and r(D) = 1. After removing all 2's and 9's from the deck, the new rank of K♦ is
$r_B$((♦, K)) = 11 + 10 = 21. So, finally,


$r(H) = 29 \cdot 6 \cdot 6 \cdot 44 + 3 \cdot 6 \cdot 44 + 1 \cdot 44 + 21 = 46{,}793.$
5.32. Example: Ordinary hands. In §1.13, we built an “ordinary” poker hand H by
choosing a five-element subset V(H) of Values that is not one of the ten possible value
sets for a straight, and then choosing a word in $\mathrm{Suits}^5$ not all of whose letters are equal (to
avoid flushes). This argument showed that there are $(C(13,5) - 10) \cdot (4^5 - 4) = 1{,}302{,}540$
ordinary poker hands. How can we find a ranking algorithm for this collection of card hands?

Let Y be the set of all five-element subsets of Values, and let $Z = \mathrm{Suits}^5$. We have
already found ranking functions $r_Y : Y \to C(13,5)$ and $r_Z : Z \to 4^5$. To take the prohibited
conditions into account, let $Y'$ = {{A,2,3,4,5}, {2,3,4,5,6}, ...} be the set of ten objects
in Y corresponding to straights, and let $Z'$ = {sssss : s ∈ Suits} be the four objects in Z
corresponding to flushes. We can get ranking functions $r'_Y : Y \sim Y' \to (C(13,5) - 10)$ and
$r'_Z : Z \sim Z' \to (4^5 - 4)$ by setting


$r'_Y(C) = r_Y(C) - |\{C' \in Y' : r_Y(C') < r_Y(C)\}|.$


This formula is practical since there are only ten possibilities for $C'$, and we can compute
the ranks of these objects in advance. They are:


0, 5, 20, 55, 125, 251, 461, 791, 1278, 1286.


For example, r({3,4,5,6,7}) = C(2,1) + C(3,2) + C(4,3) + C(5,4) + C(6,5) = 20 and
r({10,J,Q,K,A}) = C(0,1) + C(9,2) + C(10,3) + C(11,4) + C(12,5) = 1278. Now, the
rank function on $Y \sim Y'$ can be computed via the formula


$r'_Y(C) = \begin{cases} r_Y(C) - 1 & \text{if } 0 < r_Y(C) < 5; \\ r_Y(C) - 2 & \text{if } 5 < r_Y(C) < 20; \\ \qquad\vdots & \\ r_Y(C) - 9 & \text{if } 1278 < r_Y(C) < 1286. \end{cases}$

Similarly, we can set


$r'_Z(w) = r_Z(w) - |\{w' \in Z' : r_Z(w') < r_Z(w)\}|.$


In this case, we precompute


r(♣♣♣♣♣) = $p_{4,4,4,4,4}(0,0,0,0,0)$ = 0;
r(♦♦♦♦♦) = $p_{4,4,4,4,4}(1,1,1,1,1)$ = 341;
r(♥♥♥♥♥) = $p_{4,4,4,4,4}(2,2,2,2,2)$ = 682;
r(♠♠♠♠♠) = $p_{4,4,4,4,4}(3,3,3,3,3)$ = 1023.


Since these numbers form an arithmetic progression, we can write

$r'_Z(w) = r_Z(w) - \lceil r_Z(w)/341 \rceil \qquad (w \in Z \sim Z').$


Finally, the overall ranking function for a hand H constructed from the pair (C, w) is
given by $r(H) = p_{1277,1020}(r'_Y(C), r'_Z(w))$. For example, let us compute the rank of the
hand H = {A♣, 4♣, 7♥, 9♣, 10♦}. For this hand, C = {A,4,7,9,10} and w = ♣♣♥♣♦.
We calculate


$r_Y(C) = \binom{0}{1} + \binom{3}{2} + \binom{6}{3} + \binom{8}{4} + \binom{9}{5} = 219; \qquad r'_Y(C) = 219 - 5 = 214;$

$r_Z(w) = 2 \cdot 4^2 + 1 = 33; \qquad r'_Z(w) = 33 - 1 = 32;$

$r(H) = p_{1277,1020}(214, 32) = 214 \cdot 1020 + 32 = 218{,}312.$

As another example, let us unrank 1,000,000. First, $p^{-1}_{1277,1020}(10^6) = (980, 400)$. The
number 980 is between 791 and 1278 in the list of ranks of objects in $Y'$, so we recover
$r_Y(C)$ = 980 + 8 = 988. Unranking 988 produces the subset {4,5,8,9,12} of 13, which
translates into the value set {5,6,9,10,K}. Next, since 400 lies between 341 and 682, we recover
$r_Z(w)$ = 402. Repeated division by 4 produces the base-4 number 12102, which translates
to the suit sequence w = ♦♥♦♣♥. In conclusion, $u(10^6)$ = {5♦, 6♥, 9♦, 10♣, K♥}.

This example shows that applications of the difference rule can be difficult to translate 
into ranking and unranking algorithms. Our success here depended on the fact that the 
sizes of the sets being subtracted were quite small, so that their effect on the ranking could 
be specified by a relatively brief case analysis. 
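
The case analysis used for the difference rule in Example 5.32 amounts to skipping over a short sorted list of excluded ranks. The Python sketch below illustrates this idea. It is not code from the text: the function names `skip_rank` and `skip_unrank` are hypothetical, and values are represented by their indices 0 (for A) through 12 (for K).

```python
from bisect import bisect_left
from math import comb

def subset_rank(indices):
    """Sum formula 5.22 applied to the positions of the chosen elements."""
    return sum(comb(i, j) for j, i in enumerate(sorted(indices), start=1))

# The ten value sets of straights: A-5, 2-6, ..., 9-K, and 10-J-Q-K-A.
STRAIGHTS = [list(range(i, i + 5)) for i in range(9)] + [[0, 9, 10, 11, 12]]
STRAIGHT_RANKS = sorted(subset_rank(s) for s in STRAIGHTS)
FLUSH_RANKS = [0, 341, 682, 1023]

def skip_rank(r, excluded):
    """Rank of an allowed object once the excluded ranks are deleted from the list."""
    return r - bisect_left(excluded, r)      # subtract the excluded ranks below r

def skip_unrank(r_prime, excluded):
    """Inverse of skip_rank; `excluded` must be sorted in increasing order."""
    r = r_prime
    for e in excluded:
        if e <= r:                           # each excluded rank at or below r shifts r up
            r += 1
    return r

print(STRAIGHT_RANKS)                        # [0, 5, 20, 55, 125, 251, 461, 791, 1278, 1286]
print(skip_rank(219, STRAIGHT_RANKS), skip_rank(33, FLUSH_RANKS))      # 214 32
print(skip_unrank(980, STRAIGHT_RANKS), skip_unrank(400, FLUSH_RANKS)) # 988 402
```

This approach stays practical only because the excluded lists are tiny (ten and four entries here), which is the same caveat the text makes about the difference rule.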


5.33. Remark. In general, the orderings of special card hands obtained above do not 
necessarily arise by restricting the usual ordering of all five-card hands to the given subcol- 
lection. Rather, these orderings arise from the particular ordered sequence of choices used to 
generate these hands. Considerable cleverness may be required to find a ranking algorithm 
for generating a particular subcollection of card hands in lexicographic order. 


5.11 Ranking Dyck Paths 


Recall that a Dyck path of order n is a lattice path from (0,0) to (n,n) that never goes
below the line y = x. These objects are counted by the Catalan numbers $C_n = \frac{1}{n+1}\binom{2n}{n}$,
which satisfy the recursion


$C_n = \sum_{k=1}^{n} C_{k-1} C_{n-k} \qquad (n > 0)$

and initial condition $C_0 = 1$ (see 2.33). Recall that the recursion classifies Dyck paths ending
at (n,n) based on the first point (k,k) at which the path returns to the line y = x after
leaving the origin (see Figure 2.10). If we use words $w \in \{N, E\}^{2n}$ to encode Dyck paths,
the first-return recursion corresponds to the factorization $w = N w_1 E w_2$, where $w_1$ encodes
a Dyck path of order k - 1, and $w_2$ encodes a Dyck path of order n - k.

By applying the bijective sum and product rules to the preceding recursion, we can
obtain recursive ranking and unranking algorithms for Dyck paths. For each n ≥ 0, let $D_n$
be the set of words encoding Dyck paths of order n. Define ranking maps $r_n : D_n \to C_n$ as
follows. When n = 0, $r_0$ maps the empty word to the integer 0. For n > 0, suppose $w \in D_n$
has first-return factorization $w = N w_1 E w_2$, where $w_1 \in D_{k-1}$ and $w_2 \in D_{n-k}$ for some k.
Recursively compute

$r_n(w) = \sum_{j=1}^{k-1} C_{j-1} C_{n-j} + p_{C_{k-1}, C_{n-k}}(r_{k-1}(w_1), r_{n-k}(w_2)).$

The unranking maps $u_n : C_n \to D_n$ are defined as follows. First, $u_0(0)$ is the empty word.
Given n > 0 and $z \in C_n$, find the unique integer k ≤ n with


$\sum_{j<k} C_{j-1} C_{n-j} \le z < \sum_{j \le k} C_{j-1} C_{n-j}.$

Next, compute


$(x, y) = p^{-1}_{C_{k-1}, C_{n-k}}\Bigl(z - \sum_{j<k} C_{j-1} C_{n-j}\Bigr).$

Recursively determine the words $w_1 = u_{k-1}(x)$ and $w_2 = u_{n-k}(y)$, and return the answer
$u_n(z) = N w_1 E w_2$.


5.34. Example. Let us compute the rank of the Dyck path w = NNENNEEENE. The
first-return factorization of w is $w = N w_1 E w_2$ where $w_1$ = NENNEE and $w_2$ = NE. Here,
n = 5, k - 1 = 3, k = 4, and n - k = 1. The ranking formula gives


$r_5(w) = C_0 C_4 + C_1 C_3 + C_2 C_2 + p_{C_3, C_1}(r_3(w_1), r_1(w_2)).$


Now $r_1(w_2) = 0$ since $w_2$ is the only Dyck path of order 1. As for $r_3(w_1)$, we proceed
recursively. The required factorization of $w_1$ is $w_1 = N w_3 E w_4$ where $w_3$ is the empty word
and $w_4$ = NNEE. At this stage of the recursive computation, we have n = 3, k - 1 = 0,
k = 1, and n - k = 2. Therefore,


$r_3(w_1) = p_{C_0, C_2}(r_0(w_3), r_2(w_4)).$

Now $r_0(w_3) = 0$; as for $w_4$, we have (writing $\epsilon$ for the empty word)

$r_2(w_4) = C_0 C_1 + p_{C_1, C_0}(r_1(NE), r_0(\epsilon)) = 1 + p_{1,1}(0,0) = 1.$

Recall that the first few Catalan numbers are


$C_0 = 1,\ C_1 = 1,\ C_2 = 2,\ C_3 = 5,\ C_4 = 14,\ C_5 = 42,\ C_6 = 132,\ C_7 = 429.$


Working our way back through the recursive calculations, we find that $r_3(w_1) = p_{1,2}(0, 1) =
1$ and

$r_5(w) = 14 + 5 + 4 + p_{5,1}(1, 0) = 23 + 1 = 24.$


5.35. Remark. The recursive ranking and unranking calculations can be simplified slightly 
by precomputing the ranks of Dyck paths of small order, which occur over and over again 
in the calculations. Observe that our ranking method for $D_n$ lists the Dyck paths in the
following order: first, all paths whose first return to y = x is at (1,1); second, all paths 
whose first return is at (2,2); and so on. Within each of these sublists, the ordering of 
paths is determined recursively (with the aid of the bijective product rule). Using these 
observations, we can quickly enumerate Dyck paths of order at most 3 in the order implied 
by the unranking algorithm. For n = 0, the list consists of just the empty word. For n = 1, 
we get: NE. For n = 2, we get: NENE, NNEE. For n = 3, we get: 


NENENE, NENNEE, NNEENE, NNENEE, NNNEEE. 


5.36. Example. Let us unrank 211 to obtain a Dyck path of order n = 7. From the
recursion, we have


$429 = C_7 = 1 \cdot 132 + 1 \cdot 42 + 2 \cdot 14 + 5 \cdot 5 + 14 \cdot 2 + 42 \cdot 1 + 132 \cdot 1.$


The first step in unranking is to calculate the partial sums on the right side until we find
the first one larger than the given input 211. We find that


$C_0 C_6 + C_1 C_5 + C_2 C_4 = 202 \le 211 < 227 = C_0 C_6 + C_1 C_5 + C_2 C_4 + C_3 C_3.$


Therefore k = 4, k - 1 = 3 = n - k, and $(x, y) = p^{-1}_{5,5}(211 - 202) = (1, 4)$. Using the previous
example, we have $w_1 = u_3(x)$ = NENNEE and $w_2 = u_3(y)$ = NNNEEE. It follows that


$u_7(211)$ = N NENNEE E NNNEEE.
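
The first-return recursion for Dyck paths can be sketched in Python as follows. This is an illustrative sketch with hypothetical function names, not code from the text. Paths are strings over the letters N and E, and the product-rule map is taken to be $p_{a,b}(x,y) = xb + y$, consistent with the arithmetic in Examples 5.34 and 5.36.

```python
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def first_return(w):
    """Index of the E step at which the path first returns to the line y = x."""
    height = 0
    for idx, step in enumerate(w):
        height += 1 if step == "N" else -1
        if height == 0:
            return idx
    raise ValueError("not a Dyck word")

def rank_dyck(w):
    n = len(w) // 2
    if n == 0:
        return 0
    e = first_return(w)
    k = (e + 1) // 2                         # first return occurs at the point (k, k)
    w1, w2 = w[1:e], w[e + 1:]
    before = sum(catalan(j - 1) * catalan(n - j) for j in range(1, k))
    return before + rank_dyck(w1) * catalan(n - k) + rank_dyck(w2)

def unrank_dyck(z, n):
    if n == 0:
        return ""
    k = 1
    while z >= catalan(k - 1) * catalan(n - k):      # skip paths that return earlier
        z -= catalan(k - 1) * catalan(n - k)
        k += 1
    x, y = divmod(z, catalan(n - k))                 # inverse of p_{C_{k-1}, C_{n-k}}
    return "N" + unrank_dyck(x, k - 1) + "E" + unrank_dyck(y, n - k)

print(rank_dyck("NNENNEEENE"))                       # 24
print(unrank_dyck(211, 7))                           # NNENNEEENNNEEE
print([unrank_dyck(z, 3) for z in range(5)])
```

The last line prints the five Dyck paths of order 3 in the order listed in Remark 5.35. As that remark suggests, the recursive calls on short words repeat often, so caching the ranks of small paths would speed up a real implementation.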
5.12 Ranking Trees


We know from §3.7 that there are $n^{n-2}$ rooted trees on the vertex set {1,2,...,n} rooted
at vertex 1. Let B be the set of such trees; we seek ranking and unranking algorithms for
B. One way to obtain these algorithms is to use the bijective proof of 3.47. In that proof,
we described a bijection $\phi' : B \to A$, where A is the set of all functions f : {1,2,...,n} →
{1,2,...,n} such that f(1) = 1 and f(n) = n. Let C be the set of words of length n - 2
in the alphabet {0,1,...,n - 1}. The map $\psi : A \to C$ such that $\psi(f) = w_1 \cdots w_{n-2}$ with
$w_i = f(i+1) - 1$ is evidently a bijection. Furthermore, the map $p_{n,n,\ldots,n}$ from the bijective
product rule gives a bijection from C to $n^{n-2}$. Composing all these bijections, we get the
desired ranking algorithm. Inverting the bijections gives an unranking algorithm.


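5.37. Example. Consider the rooted tree $T$ shown in Figure 3.9. In 3.49, we computed
$\phi'(T)$ to be the function $g$ defined by


$g : 1 \mapsto 1,\ 2 \mapsto 2,\ 3 \mapsto 2,\ 4 \mapsto 9,\ 5 \mapsto 9,\ 6 \mapsto 7,\ 7 \mapsto 6,\ 8 \mapsto 9,\ 9 \mapsto 9.$


Applying $\psi$, we obtain the word $w_1 \cdots w_7 = 1188658$. Applying the product-rule map (which amounts to interpreting the
given word as a number written in base 9), we find that $r(T) = 649{,}349$.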
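
The last two steps of this ranking (function to word, word to base-$n$ number) can be sketched as below. This is a hedged illustration: it starts from the function $g$ rather than from the tree, because the bijection $\phi'$ of 3.47 is not restated in this section, and it reads the values of $g$ in 5.37 as $1 \mapsto 1, 2 \mapsto 2, 3 \mapsto 2, 4 \mapsto 9, 5 \mapsto 9, 6 \mapsto 7, 7 \mapsto 6, 8 \mapsto 9, 9 \mapsto 9$. The function names are hypothetical.

```python
def rank_from_function(g, n):
    """Rank of the tree whose associated function is g (a dict on 1..n with g[1] = 1, g[n] = n)."""
    word = [g[i + 1] - 1 for i in range(1, n - 1)]   # w_i = g(i+1) - 1 for i = 1, ..., n-2
    rank = 0
    for digit in word:                                # read the word as a base-n numeral
        rank = rank * n + digit
    return word, rank

def function_from_rank(z, n):
    """Inverse step: recover the function g from a rank z in {0, ..., n^(n-2) - 1}."""
    digits = []
    for _ in range(n - 2):                            # peel off base-n digits, last digit first
        z, d = divmod(z, n)
        digits.append(d)
    digits.reverse()
    g = {1: 1, n: n}
    for i, d in enumerate(digits, start=1):
        g[i + 1] = d + 1
    return g

g = {1: 1, 2: 2, 3: 2, 4: 9, 5: 9, 6: 7, 7: 6, 8: 9, 9: 9}
word, rank = rank_from_function(g, 9)
print(word, rank)
```

To rank or unrank actual trees, one would still compose these maps with $\phi'$ and its inverse.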


This application shows how valuable a bijective proof of a counting result can be. If 
we have a bijection from a complicated set of objects to a “nice” set of objects (such as 
functions or words), we can compose the bijection with standard ranking maps to obtain 
ranking and unranking algorithms for the complicated objects. In contrast, if a counting 
result is obtained by some intricate algebraic manipulation, it may not be so straightforward 
to extract an effective ranking mechanism. 




5.13 Successors and Predecessors 


Suppose $S$ is a finite set of $n$ objects, and we wish to list all the elements of $S$ in a certain
order. If we know an appropriate unranking bijection $u : \mathbf{n} \to S$, then we can generate the
desired list by computing $u(0), u(1), \ldots, u(n - 1)$ in succession. However, if the unranking
map $u$ is complicated, this method of listing $S$ may not be very efficient.

In many applications, if we know the object $z = u(i)$ that occupies a particular position
on the list, it may be possible to compute the object that immediately precedes or follows
$z$ on the list, without ever explicitly computing $i$ or applying the algorithm defining $u$ to
the inputs $i - 1$ or $i + 1$. We call $u(i - 1)$ the predecessor of $z$ (relative to the listing
determined by $u$), and we call $u(i + 1)$ the successor of $z$. Reversing the ordering of the
elements of $S$ interchanges predecessors and successors; so, in what follows, we need only
consider successors.

The successor problem asks for an efficient algorithm for finding the successor of a given 
object z relative to a given ordering. We could solve this problem by ranking z, adding 1, 
and unranking, but we typically want more elegant solutions. If we can solve the successor 
problem, and if we know what the first object on the list is, then we will have a new method 
for listing all the elements of S. Namely, we start at the first object and then repeatedly 
invoke the successor algorithm until the last object is reached. In computer programming, 
one often uses this general strategy to loop through all elements of some set of combinatorial 
objects. 

As a typical example, we develop a successor algorithm for listing the words in
$R(a_1^{n_1} \cdots a_k^{n_k})$ in alphabetical order. This example includes as special cases the problem
of listing all $k$-element subsets of an $n$-element set (which can be encoded as words in
$R(0^{n-k} 1^k)$) and the problem of listing all permutations of $k$ letters (take each $n_i = 1$). At
the outset, we fix an ordered alphabet $A = \{a_1 < \cdots < a_k\}$. Our algorithm will consist of
three functions, called “first,” “last,” and “next.” The “first” and “last” functions return
the first and last words in $R(a_1^{n_1} \cdots a_k^{n_k})$ relative to the alphabetical ordering; explicitly, we
have


$\mathrm{first}(n_1, \ldots, n_k) = a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k};$
$\mathrm{last}(n_1, \ldots, n_k) = a_k^{n_k} a_{k-1}^{n_{k-1}} \cdots a_1^{n_1}.$


In the case $n_1 + \cdots + n_k = 0$, both functions return the empty word.

The successor function “next” takes as input the integers $n_1, \ldots, n_k \ge 0$, as well as
a word $z \in R(a_1^{n_1} \cdots a_k^{n_k})$. The output of “next” is either the successor $z'$ of $z$ in the
alphabetical ordering, or a special flag (called “done”) indicating that $z$ was the last word
on the list. The operation of “next” is based on the observation that the alphabetical list
of words consists of all words starting with $a_1$ (if $n_1 > 0$), then all words starting with $a_2$
(if $n_2 > 0$), and so on. Within each of these sublists, the words are ordered in the same way
based on their second letters, and so on recursively. So we can define $\mathrm{next}(n_1, \ldots, n_k, z)$
using the following recursive algorithm.


• Base Case: If $n_1 + \cdots + n_k \le 0$ or if $z = \mathrm{last}(n_1, \ldots, n_k)$, return “done.”


• Recursion: Suppose $z = a_i z'$. Recursively compute $w' = \mathrm{next}(n_1, \ldots, n_i - 1, \ldots, n_k, z')$.
If $w'$ is not the special “done” flag, return $a_i w'$ as the answer. Otherwise, find the
smallest index $j > i$ with $n_j > 0$. If no such index exists, return “done”; else return the
concatenation of $a_j$ and $\mathrm{first}(n_1, \ldots, n_j - 1, \ldots, n_k)$.


5.38. Example. Consider $z =$ bdcaccb $\in R(a^1 b^2 c^3 d^1)$. To compute the successor of
$z$, we are first directed to find the successor of $z' =$ dcaccb $\in R(a^1 b^1 c^3 d^1)$. Continuing
recursively, we are led to consider the words caccb, then accb, then ccb. But ccb is
the last word in $R(a^0 b^1 c^2 d^0)$, so next(0,1,2,0,ccb) returns “done.” Returning to the
calculation of next(1,1,2,0,accb), we seek the next available letter after ‘a’, which is ‘b’.
We concatenate ‘b’ and first(1,0,2,0) = acc to obtain next(1,1,2,0,accb) = bacc. Then
next(1,1,3,0,caccb) = cbacc. Continuing similarly, we eventually obtain the final output
“bdcbacc” as the successor of $z$.

If we apply the “next” function to this new word, we strip off initial letters one at a 
time until we reach the suffix “cc,” which is the last word in its class. So, to compute 
next(1,0,2,0,acc), we must find the next available letter after ‘a’, namely ‘c’, and append 
to this the word first(1,0, 1,0) = ac. Thus, next(1,0,2,0,acc) = cac, and working back up 
the recursive calls leads to a final answer of “bdcbcac.” This example shows why we need 
to remember the values $n_1, \ldots, n_k$ in each recursive call to “next.”
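
The three routines “first,” “last,” and “next” can be sketched in Python as follows. This is a minimal illustration under some assumptions of ours: the ordered alphabet is passed as a string, the multiplicities as a list `counts`, and the “done” flag is represented by `None`. The helper `all_words` is hypothetical and simply iterates the successor map.

```python
def first(counts, alphabet):
    """First word of the class: a_1^{n_1} a_2^{n_2} ... a_k^{n_k}."""
    return "".join(letter * c for letter, c in zip(alphabet, counts))

def last(counts, alphabet):
    """Last word of the class: a_k^{n_k} ... a_1^{n_1}."""
    return first(counts[::-1], alphabet[::-1])

def next_word(counts, z, alphabet):
    """Return the successor of z in alphabetical order, or None for 'done'."""
    if sum(counts) == 0 or z == last(counts, alphabet):
        return None                              # base case
    i = alphabet.index(z[0])
    reduced = list(counts)
    reduced[i] -= 1                              # counts for the suffix z'
    w = next_word(reduced, z[1:], alphabet)
    if w is not None:
        return z[0] + w                          # keep the first letter, advance the suffix
    for j in range(i + 1, len(alphabet)):        # smallest j > i with n_j > 0
        if counts[j] > 0:
            rest = list(counts)
            rest[j] -= 1
            return alphabet[j] + first(rest, alphabet)
    return None

def all_words(counts, alphabet):
    """Yield every word of the class by starting at 'first' and iterating 'next'."""
    z = first(counts, alphabet)
    while z is not None:
        yield z
        z = next_word(counts, z, alphabet)

print(next_word([1, 2, 3, 1], "bdcaccb", "abcd"))
print(next_word([1, 2, 3, 1], "bdcbacc", "abcd"))
```

Passing the full list of counts into each recursive call mirrors the remark above: the suffix alone does not determine which letters are still available.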


5.39. Example. If we use the given method to list the permutations of {1,2,3,4}, we 
obtain: 
1234, 1243, 1324, 1342, 1423, 1432, 2134, 2143, ..., 4321. 


5.40. Example. Using the encoding of subsets as binary words (see 1.38), we can list all
2-element subsets of $\{1, 2, 3, 4, 5\}$ by running through the words in $R(1^2 0^3)$. (For convenience,
we choose the alphabet ordering $1 < 0$ here.) The words are:


11000, 10100, 10010, 10001, 01100, 01010, 01001, 00110, 00101, 00011. 

The associated subsets are:


{1,2}, {1,3}, {1,4}, {1,5}, {2,3}, {2,4}, {2,5}, {3,4}, {3,5}, {4,5}. 


In general, the method used here lists k-element subsets of {1,2,...,n} in lexicographic 
order. Using the ordering of the letters 0 < 1 would have produced the reversal of the list 
displayed above. In contrast, the ranking method discussed in §5.6 lists the subsets according 
to a different ordering, in which all subsets not containing n are listed first, followed by all 
subsets that do contain n, and so on recursively. The latter method produces the following 
list of subsets: 


{1,2}, {1,3}, {2,3}, {1,4}, {2,4}, {3,4}, {1,5}, {2,5}, {3,5}, {4,5}.
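
The two orderings can be compared with a short Python sketch. It assumes the rank formula $r(B) = \sum_{j} \binom{i_j}{j}$ for subsets (as stated in the chapter summary), with the convention $x_i = i + 1$ so that the alphabet is $\{1, \ldots, 5\}$; the helper name `subset_rank` is hypothetical.

```python
from itertools import combinations
from math import comb

def subset_rank(B):
    """Rank of B under r(B) = sum_j C(i_j, j), where the j-th smallest element is x_{i_j} = i_j + 1."""
    return sum(comb(x - 1, j) for j, x in enumerate(sorted(B), start=1))

# itertools.combinations emits the 2-element subsets in lexicographic order.
lex = list(combinations(range(1, 6), 2))

# Sorting by rank gives the ordering produced by the ranking method of §5.6.
by_rank = sorted(lex, key=subset_rank)

print(lex)
print(by_rank)
```

The second list groups the subsets by their largest element, which is the "subsets not containing $n$ first" behavior described above.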


5.41. Example: Successor Algorithm for Dyck Paths. Let us design a successor
algorithm for generating Dyck paths of order $n$, based on the “first-return” recursion (§2.7).
As above, we will use three routines called “first,” “last,” and “next.” The routine first($n$)
returns the path $(\mathrm{NE})^n$, which returns to the diagonal as early as possible, while the routine
last($n$) returns the path $\mathrm{N}^n \mathrm{E}^n$, which returns to the diagonal as late as possible. If $n \ge 0$
and $w$ encodes a Dyck path of order $n$, then next($n$, $w$) will return the next Dyck path in
the chosen ordering, or “done” if $w$ is the last path. This routine works as follows:


• Base Case: If $n \le 1$ or $w = \mathrm{last}(n)$, then return “done.”


• Recursion: Find the first-return factorization $w = \mathrm{N} w_1 \mathrm{E} w_2$ of $w$, where $w_1 \in D_{k-1}$ and
$w_2 \in D_{n-k}$ for some $k$ between 1 and $n$. Now consider subcases.

- Calculate $w_2' = \mathrm{next}(n - k, w_2)$. If this path exists, return $\mathrm{N} w_1 \mathrm{E} w_2'$ as the answer.

- Otherwise, find $w_1' = \mathrm{next}(k - 1, w_1)$. If this path exists, return $\mathrm{N}\, w_1'\, \mathrm{E}\, \mathrm{first}(n - k)$
as the answer.

- Otherwise, increase $k$ by 1. If the new $k$ is $\le n$, return $\mathrm{N}\, \mathrm{first}(k - 1)\, \mathrm{E}\, \mathrm{first}(n - k)$
as the answer.

- Otherwise, return “done.”


For example, let us compute next(7, N NENNEE E NNNEEE). The given input $w$ factorizes
as $w = \mathrm{N} w_1 \mathrm{E} w_2$ where $w_1 =$ NENNEE and $w_2 =$ NNNEEE. Here, $k = 4$, $k - 1 = 3$, and
$n - k = 3$. By inspection, $w_2 = \mathrm{last}(3)$, so we proceed to the next subcase. We are directed
to compute $w_1' = \mathrm{next}(3, w_1)$. Here, $w_1$ factors as $w_1 = \mathrm{N} w_3 \mathrm{E} w_4$, where $w_3$ is the empty
word and $w_4 =$ NNEE. Again, $w_4$ is the last Dyck path of order 2, and this time $w_3$ is also
the last Dyck path of order 0. So we increase $k$ by 1, and return


$w_1' =$ N first(1) E first(1) = N NE E NE.

Using this result in the original calculation, we obtain


next($w$) = N $w_1'$ E first(3) = N NNEENE E NENENE.
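
The routine just described can be sketched in Python as below. This is an illustrative implementation under our own conventions: paths are strings over the letters N and E, `None` plays the role of “done,” and the function names are hypothetical.

```python
def first_dyck(n):
    """First path of order n: (NE)^n."""
    return "NE" * n

def last_dyck(n):
    """Last path of order n: N^n E^n."""
    return "N" * n + "E" * n

def next_dyck(n, w):
    """Return the successor of the Dyck path w of order n, or None for 'done'."""
    if n <= 1 or w == last_dyck(n):
        return None                                  # base case
    # First-return factorization w = N w1 E w2: find where the height first returns to 0.
    height = 0
    for pos, step in enumerate(w):
        height += 1 if step == "N" else -1
        if height == 0:
            break
    k = (pos + 1) // 2
    w1, w2 = w[1:pos], w[pos + 1:]
    nxt2 = next_dyck(n - k, w2)
    if nxt2 is not None:                             # advance w2, keep w1
        return "N" + w1 + "E" + nxt2
    nxt1 = next_dyck(k - 1, w1)
    if nxt1 is not None:                             # advance w1, reset w2
        return "N" + nxt1 + "E" + first_dyck(n - k)
    k += 1                                           # move the first return one step later
    if k <= n:
        return "N" + first_dyck(k - 1) + "E" + first_dyck(n - k)
    return None

print(next_dyck(7, "NNENNEEENNNEEE"))
```

Starting from `first_dyck(n)` and calling `next_dyck` until it returns `None` lists all Dyck paths of order $n$.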




5.14 Random Selection 


We now briefly revisit the problem of random selection of combinatorial objects. Suppose 
S is a set of n objects, and we wish to randomly select an element of S. If we have an 
unranking function $u : \mathbf{n} \to S$, we can select the object by generating a random integer in
$\mathbf{n}$ and unranking it. However, it may be impractical to generate such an integer if $n$ is very
large. If the objects in $S$ are generated by a recursive process, we can often get around this
problem by making several random choices that are used to build an object in $S$ in stages.


5.42. Example: Subsets. As a typical example, consider the question of randomly choosing
a $k$-element subset of $A = \{x_1, \ldots, x_n\}$. Earlier (§5.6), we developed ranking and
unranking algorithms for this problem based on the recursion


$C(n, k) = C(n - 1, k) + C(n - 1, k - 1). \qquad (5.2)$


A slight adjustment of the methods used before will lead to an efficient solution for the 
random selection problem. 

Recall that the term $C(n - 1, k)$ in the recursion counts the $k$-element subsets of $A$
that do not contain $x_n$, while the term $C(n - 1, k - 1)$ counts subsets that do contain $x_n$.
If we choose a random $k$-element subset of $A$, the probability that it will not contain $x_n$
is $C(n - 1, k)/C(n, k) = (n - k)/n$, and the probability that it will contain $x_n$ is
$C(n - 1, k - 1)/C(n, k) = k/n$. This suggests the following recursively defined random selection
procedure. To choose a $k$-element subset $B$ of $\{x_1, \ldots, x_n\}$, use a random number generator
to obtain a real number $r \in (0, 1]$. If $r \le k/n$, declare that $x_n$ belongs to $B$, and recursively
select a random $(k - 1)$-element subset of $\{x_1, \ldots, x_{n-1}\}$ by the same method. If $r > k/n$,
declare that $x_n$ does not belong to $B$, and recursively select a random $k$-element subset of
$\{x_1, \ldots, x_{n-1}\}$. The base cases occur when $k = 0$ or $k = n$, in which case there is only one
possible subset to select.
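
A minimal Python sketch of this selection procedure follows. It assumes the ground set is $\{1, \ldots, n\}$ (so $x_i = i$), unrolls the recursion into a loop, and obtains $r \in (0, 1]$ as `1 - rng.random()`; the function name and the `rng` parameter are our own choices for illustration.

```python
import random

def random_subset(n, k, rng=random):
    """Return a uniformly random k-element subset of {1, ..., n} as a set."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    chosen = set()
    while k > 0:                        # base case k = 0: nothing left to choose
        if k == n:                      # base case k = n: every remaining element is forced
            chosen.update(range(1, n + 1))
            break
        r = 1.0 - rng.random()          # rng.random() lies in [0, 1), so r lies in (0, 1]
        if r <= k / n:                  # with probability k/n, element n is in the subset
            chosen.add(n)
            k -= 1
        n -= 1                          # continue with {1, ..., n-1}
    return chosen

print(sorted(random_subset(10, 4)))
```

Only about $n$ small random choices are made, so no random integer as large as $C(n, k)$ is ever needed.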


5.43. Example: Set Partitions. A similar method can be used to randomly select a
set partition of an $n$-element set $A = \{x_1, \ldots, x_n\}$ into $k$ blocks. Recall that these objects are
counted by the Stirling numbers $S(n, k)$, which satisfy the recursion


$S(n, k) = S(n - 1, k - 1) + k\,S(n - 1, k) \quad (n, k > 0).$


Recall that the first term on the right counts set partitions with $x_n$ in a block by itself, while
the second term counts set partitions in which $x_n$ occurs in a block with other elements.
The base cases of the selection procedure occur when $k = 0$ or $k = n$; here, there is at most
one possible object to select. For the main case, assume $n > 0$ and $0 < k < n$. Choose a
random real number $r \in [0, 1]$. Consider the quantity $r_0 = S(n - 1, k - 1)/S(n, k)$, which
can be computed (at least approximately) using the Stirling recursion. Note that $r_0$ is the
probability that a random set partition of $A$ into $k$ blocks will have $x_n$ in a block by itself.
If $r \le r_0$, recursively select a set partition of $\{x_1, \ldots, x_{n-1}\}$ into $k - 1$ blocks, and append
the block $\{x_n\}$ to this set partition to obtain the answer. In the alternative case $r > r_0$,
recursively select a set partition $\{B_1, \ldots, B_k\}$ of $\{x_1, \ldots, x_{n-1}\}$ into $k$ blocks. Now, choose
a random integer $i$ in the range $\{1, \ldots, k\}$, and insert $x_n$ into block $B_i$ to obtain the answer.
Such an integer $i$ can be found, for example, by choosing another random real number
$s \in (0, 1]$, multiplying by $k$, and rounding up to the nearest integer.
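
The procedure for set partitions can be sketched in the same style. The sketch below makes a few assumptions of its own: the ground set is $\{1, \ldots, n\}$, blocks are Python sets collected in a list, the ratio $r_0$ is computed in floating point (so it is only approximate for very large Stirling numbers), and the block index is drawn with `randrange` instead of rounding a real number. All names are hypothetical.

```python
import random
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number S(n, k), from S(n, k) = S(n-1, k-1) + k * S(n-1, k)."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def random_set_partition(n, k, rng=random):
    """Return a random set partition of {1, ..., n} into k blocks, as a list of sets."""
    if stirling2(n, k) == 0:
        raise ValueError("no set partition of an n-element set into k blocks")
    if n == 0:
        return []                                    # base case n = k = 0: the empty partition
    if k == n:
        return [{i} for i in range(1, n + 1)]        # base case: all blocks are singletons
    # Main case 0 < k < n: r0 is the probability that n lies in a block by itself.
    r0 = stirling2(n - 1, k - 1) / stirling2(n, k)
    if rng.random() < r0:
        blocks = random_set_partition(n - 1, k - 1, rng)
        blocks.append({n})
    else:
        blocks = random_set_partition(n - 1, k, rng)
        blocks[rng.randrange(k)].add(n)              # insert n into one of the k blocks
    return blocks

print(random_set_partition(8, 3))
```

For exact probabilities one could instead compare a random integer below $S(n, k)$ with $S(n - 1, k - 1)$, at the cost of generating larger random integers.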


5.44. Example: Permutations. As a final example, consider the problem of randomly 
generating a permutation of {1,2,...,n}. We can convert the standard counting argument 
(based on the product rule) into a random selection procedure. However, some care is 
required, since the available choices at each stage depend on what choices were made in 
previous stages. 

Recall that one method for building a permutation $w = w_1 w_2 \cdots w_n$ of $\{1, 2, \ldots, n\}$ is
to choose $w_1$ to be any letter (in $n$ ways), then choose $w_2$ to be any letter other than $w_1$
(in $n - 1$ ways), then choose $w_3$ to be any letter other than $w_1$ or $w_2$ (in $n - 2$ ways),
and so on. The associated random generation algorithm would operate as follows. Generate
random integers $i_1 \in \{1, 2, \ldots, n\}$, $i_2 \in \{1, 2, \ldots, n - 1\}$, ..., $i_n \in \{1\}$. Define $w_1 = i_1$.
Define $w_2$ to be the $i_2$th smallest element in the set $\{1, 2, \ldots, n\} \setminus \{w_1\}$. In general, define
$w_j$ to be the $i_j$th smallest element in the set $\{1, 2, \ldots, n\} \setminus \{w_1, \ldots, w_{j-1}\}$.


This computation can become rather messy, since we must repeatedly scan through the
remaining letters to determine the $i_j$th smallest one. Furthermore, the method is really no
different from unranking a random integer between 0 and $n! - 1$. An alternative recursive
generation method proceeds as follows. If $n = 1$, return $w = 1$ as the answer. If $n > 1$,
first recursively generate a random permutation $w' = w_1' \cdots w_{n-1}'$ of $\{1, 2, \ldots, n - 1\}$. Next,
generate a random integer $w_n \in \{1, 2, \ldots, n\}$. Now make a single scan through the previously
chosen letters $w_i'$, and let $w_i = w_i'$ if $w_i' < w_n$, $w_i = w_i' + 1$ if $w_i' \ge w_n$. This determines the
final answer $w = w_1 w_2 \cdots w_n$. We let the reader check that every permutation is equally
likely to be generated when using this method.
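
The alternative generation method can be sketched in a few lines of Python. This version is an illustration only: it unrolls the recursion into a loop over $m = 1, 2, \ldots, n$, and the function name and `rng` parameter are our own.

```python
import random

def random_permutation(n, rng=random):
    """Return a random permutation of {1, ..., n} as a list, built one letter at a time."""
    w = []
    for m in range(1, n + 1):
        new_last = rng.randint(1, m)      # the new final letter, chosen from {1, ..., m}
        # Single scan: letters >= new_last are shifted up by one to make room for it.
        w = [x if x < new_last else x + 1 for x in w] + [new_last]
    return w

print(random_permutation(8))
```

Each stage makes one scan of the letters chosen so far, so no search for the "$i_j$th smallest remaining letter" is needed.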


Summary 


• Definitions. Let $S$ be a set of $n$ objects. A ranking map for $S$ is a bijection $r : S \to \mathbf{n}$,
where $\mathbf{n} = \{0, 1, 2, \ldots, n - 1\}$. An unranking map for $S$ is a bijection $u : \mathbf{n} \to S$. Given
a particular total ordering of the elements of $S$, a successor map for $S$ is a function that
maps each $z \in S$ to the element immediately following $z$ in the ordering (if any).


• Bijective Sum Rule. Let $S_1, \ldots, S_k$ be disjoint finite sets with union $S$. Given bijections
$f_i : S_i \to \mathbf{n_i}$, there is a bijection $f = \sum_i f_i : S \to \mathbf{n}$ (where $n = n_1 + \cdots + n_k$) given by


$f(x) = \sum_{j<i} n_j + f_i(x) \quad (x \in S_i).$


The map $f$ depends on the given ordering of the $f_i$'s. To compute $f^{-1}(z)$, find the
unique index $i$ such that $\sum_{j<i} n_j \le z < \sum_{j \le i} n_j$, and let $f^{-1}(z) = f_i^{-1}(z - \sum_{j<i} n_j)$.


• Bijective Product Rule. Given positive integers $n_1, \ldots, n_k$ with product $n$, there is a
bijection $p = p_{n_1, n_2, \ldots, n_k} : \mathbf{n_1} \times \mathbf{n_2} \times \cdots \times \mathbf{n_k} \to \mathbf{n}$ given by


$p(c_1, c_2, \ldots, c_k) = c_1 n_2 n_3 \cdots n_k + c_2 n_3 \cdots n_k + \cdots + c_{k-1} n_k + c_k \quad (0 \le c_i < n_i).$


To compute $p^{-1}(z)$, let $q_k$ and $r_k$ be the quotient and remainder when $z$ is divided by
$n_k$. Then let $q_{k-1}$ and $r_{k-1}$ be the quotient and remainder when $q_k$ is divided by $n_{k-1}$.
Continue similarly; then $p^{-1}(z) = (r_1, r_2, \ldots, r_k)$. If all $n_i$'s equal the same integer $b$,
$p^{-1}(z)$ is the base-$b$ expansion of $z$.


• Ranking $k$-Permutations. Suppose we are ranking $k$-permutations $w = w_1 w_2 \cdots w_k$ of
an ordered alphabet $A = (x_0, x_1, \ldots, x_{n-1})$. To find $r(w)$, first compute $(j_1, \ldots, j_k)$
by letting $j_i$ be the number of letters preceding $w_i$ in the given ordering of $A$ that
are different from $w_1, \ldots, w_{i-1}$. Then calculate $r(w) = p_{n, n-1, \ldots, n-k+1}(j_1, \ldots, j_k)$. To
unrank $z$, apply $p^{-1}$ to recover $(j_1, \ldots, j_k)$, and then recover $w_1, \ldots, w_k$ from left to
right by letting $w_i$ be the $(j_i + 1)$th smallest letter in $A \setminus \{w_1, \ldots, w_{i-1}\}$.


• Rank Formula for Subsets. Suppose we are ranking $k$-element subsets of an ordered
alphabet $A = (x_0, x_1, \ldots, x_{n-1})$. If $B = \{x_{i_1} < x_{i_2} < \cdots < x_{i_k}\}$, then we can take
$r(B) = \sum_{j=1}^{k} \binom{i_j}{j}$. We can unrank a given integer $z$ by a greedy strategy that recovers
$i_k, \ldots, i_1$ (in this order) by choosing the largest possible value that will not cause the
partial sum so far to exceed $z$. This ranking method leads to a listing of $k$-element
subsets in which all subsets containing $x_{n-1}$ appear after all subsets not containing
$x_{n-1}$; each sublist is ordered in the same way relative to $x_{n-2}$, etc.


• Rank Formula for Anagrams. The following recursive formula can be used to rank words
$w \in R(a_1^{n_1} \cdots a_k^{n_k})$ in alphabetical order: if $w = a_i w'$, then


$r(w) = \sum_{j<i} C(n_1 + \cdots + n_k - 1;\ n_1, \ldots, n_j - 1, \ldots, n_k) + r(w').$


To unrank a given integer $z$, choose $i$ as large as possible so that the sum in the previous
formula does not exceed $z$; subtract the sum for this choice of $i$ from $z$; unrank the result
recursively; and prepend the letter $a_i$ to obtain the final answer.


• Successor Algorithm for Anagrams. Consider all words in $R(a_1^{n_1} \cdots a_k^{n_k})$ in alphabetical
order. To find the word immediately following $w$, first write $w = a_i w'$. If $w'$ is not
the last word in its rearrangement class, recursively compute its successor (say $z'$), and
return $a_i z'$ as the successor of $w$. Otherwise, find the first $j > i$ with $n_j > 0$, and return
$a_j b'$ as the successor of $w$, where $b' = a_1^{n_1} \cdots a_j^{n_j - 1} \cdots a_k^{n_k}$.


• Random Selection Algorithms. Suppose we want to randomly select an object from a
given set $S$. If an unranking map $u : \mathbf{n} \to S$ is available, we can generate a random
integer $z \in \mathbf{n}$ and return $u(z)$. Alternatively, if the objects in $S$ can be built up in
stages, we can make a random choice at each stage to decide how to build the object.
For instance, to build a random $k$-subset $B$ of $\{1, 2, \ldots, n\}$, we can include $n$ in $B$ with
probability $k/n$, and then choose the remaining elements of $B$ recursively.
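
As a companion to the summary of the bijective product rule, the map $p_{n_1, \ldots, n_k}$ and its inverse can be sketched as follows. This is a minimal illustration using Horner-style evaluation and repeated integer division; the function names `product_rank` and `product_unrank` are hypothetical.

```python
def product_rank(c, ns):
    """Compute p_{n_1,...,n_k}(c_1,...,c_k) = c_1 n_2...n_k + ... + c_{k-1} n_k + c_k."""
    z = 0
    for ci, ni in zip(c, ns):
        if not 0 <= ci < ni:
            raise ValueError("each c_i must satisfy 0 <= c_i < n_i")
        z = z * ni + ci                  # Horner-style evaluation of the mixed-radix numeral
    return z

def product_unrank(z, ns):
    """Invert product_rank by repeated division, starting with the last modulus n_k."""
    remainders = []
    for ni in reversed(ns):
        z, r = divmod(z, ni)             # quotient feeds the next division, remainder is r_i
        remainders.append(r)
    if z != 0:
        raise ValueError("z is not less than the product of the n_i")
    return tuple(reversed(remainders))

print(product_rank((1, 4), (5, 5)))      # the pair (1, 4) used in 5.36
print(product_unrank(9, (5, 5)))
```

When all moduli equal $b$, `product_unrank` returns the base-$b$ digits of $z$, padded with leading zeroes to length $k$.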


Exercises 


5.45. Suppose $f : \{a, b, c\} \to \mathbf{3}$ and $g : \{d, e\} \to \mathbf{2}$ are defined by $f(a) = 1$, $f(b) = 2$,
$f(c) = 0$, $g(d) = 1$, $g(e) = 0$. Compute the bijections $f + g$ and $g + f$.


5.46. Compute (a) $p_{7,5}(4, 3)$; (b) $p_{7,5}(3, 4)$; (c) $p_{5,7}(4, 3)$; (d) $p_{5,7}(3, 4)$; (e) $p_{7,5}^{-1}(22)$; (f)
$p_{5,7}^{-1}(22)$.


5.47. Find (a) $p_{2,2,2,2,2}(0, 1, 1, 0, 1)$; (b) $p_{2,2,2,2,2}^{-1}(29)$; (c) $p_{7,7,7}(3, 0, 6)$; (d) $p_{7,7,7}^{-1}(306)$; (e)
$p_{10,10,10}^{-1}(306)$.


5.48. Compute: (a) $p_{5,4,3,2,1}(3, 3, 0, 1, 0)$; (b) $p_{5,4,3,2,1}^{-1}(111)$; (c) $p_{3,6,2,6}(2, 5, 0, 4)$;
(d) $p_{3,6,2,6}^{-1}(150)$; (e) $p_{6,2,6,3}^{-1}(150)$; (f) $p_{6,6,3,2}^{-1}(150)$.


5.49. Consider the product set $X = \mathbf{3} \times \mathbf{4}$. (a) View $X$ as the disjoint union of the sets
$X_i = \{i\} \times \mathbf{4}$, for $i = 0, 1, 2$. Let $f_i : X_i \to \mathbf{4}$ be the bijection $f_i(i, y) = y$. Compute the
bijections $f_0 + f_1 + f_2$ and $f_2 + f_1 + f_0$, which map $X$ to $\mathbf{12}$. (b) View $X$ as the disjoint
union of the sets $X'_j = \mathbf{3} \times \{j\}$, for $j = 0, 1, 2, 3$. Let $g_j : X'_j \to \mathbf{3}$ be the bijection
$g_j(x, j) = x$. Compute the bijection $g_0 + g_1 + g_2 + g_3 : X \to \mathbf{12}$. (c) Compute the bijection
$p_{3,4} : X \to \mathbf{12}$. Is this one of the maps found in (a) or (b)? (d) Let $t : X \to \mathbf{4} \times \mathbf{3}$ be the
bijection $t(i, j) = (j, i)$. Compute the bijection $p_{4,3} \circ t : X \to \mathbf{12}$. Is this one of the maps
found in (a) or (b)?

5.50. Rank the following four-letter words: (a) alto; (b) zone; (c) rank; (d) four; (e) word.


5.51. Unrank the following numbers in $\mathbf{26^4}$ to obtain four-letter words: (a) 115,287; (b)
396,588; (c) 392,581; (d) 338,902; (e) 275,497.


5.52. (a) Rank the six-letter word “unrank.” (b) Unrank 199,247,301 to get a 6-letter word. 
(c) What happens if we unrank 199,247,301 to get a k-letter word where k > 6? 


5.53. A fraternity name consists of either two or three capital Greek letters. Recall that 
there are 24 letters in the Greek alphabet, ordered as follows: 


Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω.


Assume an ordering of fraternity names consisting of all two-letter names in alphabetical 
order, followed by all three-letter names in alphabetical order. Compute the rank of (a) 
ΦΒΚ; (b) AA; (c) AAA; (d) ΑΧΩ. Now, unrank: (e) 144; (f) 1440; (g) 13931.


5.54. Repeat (a)—(g) in 5.53, assuming the names are ordered so that all three-letter names 
precede all two-letter names, with names of each length in alphabetical order. 


5.55. Repeat (a)-(g) in 5.53, assuming the names are ordered in alphabetical order (so that,
for example, ΑΔ is immediately preceded by ΑΓΩ and immediately followed by ΑΔΑ).


5.56. Consider the set of four-digit even numbers (no leading zeroes allowed) that do not 
contain the digit 6. (a) Use the product rule to count this set. (b) Find a ranking bijection 
that will list these numbers in increasing numerical order. (c) Use (b) to rank 1234, 2500, 
and 9708. (d) Now unrank 1234, 2501, and 666. 


5.57. Consider five-letter palindromes, ranked in alphabetical order. (a) Rank the palin- 
dromes LEVEL and MADAM. (b) Unrank 1581 and 12,662. (c) Find the first and last 
palindromes in the ranking that are real English words. 


5.58. A Virginia license plate consists of three uppercase letters followed by four digits. 
For arcane bureaucratic reasons, license plate 0 is ZZZ-9999, followed by ZZZ-9998, ..., 
ZZZ-0000, ZZY-9999, etc. Use this system to rank the license plates: (a) ZCF-2073; (b) 
JXB-2007; (c) ABC-1234. Now unrank: (d) 7,777,777; (e) 123,456,789. 


5.59. Repeat the previous exercise assuming a new ordering, honoring the 400th anniver- 
sary of Jamestown, where license plate 0 is JAM-1607, and license plates count forward in 
lexicographic order (“wrapping around” from ZZZ-9999 to AAA-0000). 


5.60. Let A = {a,b,c,d,e, f}. (a) Compute the ranks of bfdc and fdac among all 4- 
permutations of A. (b) Unrank 232 to get a 4-permutation of A. (c) Compute the rank 
of ecafdb among all permutations of A. (d) Unrank 583 to get a permutation of A. 


5.61. (a) Compute the rank of 42153 among all permutations of {1,2,...,5}. (b) Unrank 
46 to obtain a permutation of {1,2,...,5}. 


5.62. (a) Compute the rank of 36281745 among all permutations of {1,2,...,8}. (b) Unrank 
23,419 to obtain a permutation of {1,2,...,8}. 


5.63. Let $A = \{a, b, c, d, e, f, g, h\}$. (a) Use the ranking formula for $S_4(A)$ in §5.6 to rank the
subsets $\{a, c, e, g\}$, $\{b, c, d, h\}$, and $\{d, e, f, h\}$. (b) Unrank 30, 40, and 50 to obtain 4-element
subsets of $A$.

5.64. (a) Devise a ranking algorithm for $k$-element subsets of an $n$-element alphabet based
on the recursion $C(n, k) = C(n - 1, k - 1) + C(n - 1, k)$, which differs from the recursion in §5.6
due to the reversal of the order of terms on the right side. (b) Describe informally the order
in which the ranking algorithm in (a) will produce the k-element subsets. (c) Answer the 
ranking and unranking questions in the previous exercise using this new ranking algorithm. 


5.65. (a) Find the ranks of bbccacba and cabcabbc in the set $R(a^2 b^3 c^3)$, ordered
alphabetically. (b) Unrank 206 and 497 to get anagrams in $R(a^2 b^3 c^3)$.


5.66. (a) Compute the rank of MISSISSIPPI among the set of all anagrams in $R(I^4 M^1 P^2 S^4)$
(listed alphabetically). (b) Which anagram in this set has rank 33,333? 


5.67. (a) Use the rank functions $r_{n,k}$ in §5.8 to rank the integer partitions (3, 3, 3), (5, 2, 2),
and (4, 3, 2, 1). (b) Compute $u_{12,3}(6)$, $u_{15,4}(22)$, and $u_{20,6}(47)$.


5.68. Use the rank function $r_{8,4}$ from §5.8 to list all integer partitions of 8 into 4 parts.
5.69. Enumerate all the integer partitions of 7, following the method used in 5.27. 
5.70. Use the rank functions from §5.9 to list all set partitions of {1, 2,3, 4,5} into 3 blocks. 


5.71. (a) Use the algorithms in §5.9 to rank the following set partitions relative to the 
set SP(n,k): {{1, 3}, {2, 4, 5}}; {{1, 5, 7}, {2}, {3, 4, 8}, {6}}. (b) Unrank 247 to obtain a set 
partition in SP(7,4). (c) Unrank 1492 to obtain a set partition in SP(8, 4). 


5.72. (a) Rank the four-of-a-kind hand {3@, 89, 80, 8@, 8} (see 5.29). (b) Unrank 264 to 
get a four-of-a-kind hand. 


5.73. (a) Rank the full house hand {3@, 39,30, 9@, 9} (see 5.30). (b) Unrank 3082 to 
get a full house hand. 


5.74. (a) Rank the full house hand {A@, AY, AO, KU, Kd} (see 5.30). (b) Unrank 483 to 
get a full house hand. 


5.75. (a) Rank the two-pair hand {AQ, AU, 7@, QO, Q@} (see 5.31). (b) Unrank 71,031 to 
get a two-pair hand. 


5.76. (a) Rank the hand {5&, 7, 8@, 109, JV} among all possible poker hands. (b) Rank 
the same hand among all “ordinary” poker hands (see 5.32). (c) Unrank 1,159,403 to get 
one of the $\binom{52}{5}$ possible poker hands. (d) Unrank 1,159,403 to get an ordinary poker hand.
5.77. Use the ranking algorithm in §5.11 to list all Dyck paths of order (a) 4; (b) 5. 

5.78. (a) Rank the Dyck path NNNENEENNEENNENEEE (see §5.11). (b) Unrank 52 to 
get a Dyck path of order 6. (c) Unrank 335 to get a Dyck path of order 7. 


5.79. (a) Use the method of §5.12 to find the rank of the rooted tree 
= 1h, 1), (2, 1), (3, 1), (4, 1), (10, 1), (7, 1), (6, is (5, 6), (8, 5), (11, 6), (9, 11) je 
(b) Unrank 1,609,765 to obtain a rooted tree on 9 vertices rooted at vertex 1. 


5.80. Use the algorithm in §5.13 to find the successor of each word in the appropriate set 
of anagrams: (a) ccbabdc; (b) 3641275; (c) 01101011; (d) 33212312; (e) UKULELE. 


5.81. Write an algorithm to find the predecessor of a given word $w \in R(a_1^{n_1} \cdots a_k^{n_k})$ in
the alphabetical ordering of anagrams. Use this to find the predecessor of each word in the 
previous exercise. 

5.82. (a) Use the successor algorithm in §5.13 to find the first four successors of the Dyck
path NNNENEENNEENNENEEE. (b) Write a predecessor algorithm for Dyck paths. (c) 
Use (b) to compute the first four predecessors of the Dyck path in (a). 


5.83. Give careful proofs of the bijective sum rules (5.1 and 5.2). 


5.84. Prove that for all $a \in \mathbb{Z}$ and all nonzero $b \in \mathbb{Z}$, there exist unique $q, r \in \mathbb{Z}$ with
$a = bq + r$ and $0 \le r < |b|$. Describe an algorithm for computing $q$ and $r$ given $a$ and $b$.


5.85. Prove that for all $a \in \mathbb{Z}$ and all nonzero $b \in \mathbb{Z}$, there exist unique $q, r \in \mathbb{Z}$ with
$a = bq + r$ and $-|b|/2 < r \le |b|/2$. Describe an algorithm for computing $q$ and $r$ given $a$
and $b$.


5.86. Suppose $0 \ne b \in \mathbb{Z}$ and $S \subseteq \mathbb{Z}$. Find a necessary and sufficient condition on $S$ that
will make the following statement true: for all $a \in \mathbb{Z}$, there exist unique $q, r \in \mathbb{Z}$ with
$a = bq + r$ and $r \in S$. How could one find $q$ and $r$ given $a$, $b$, and $S$?


5.87. Division Algorithm for Polynomials. (a) Suppose $F$ is a field, $g \in F[x]$ is a
nonzero polynomial, and $f \in F[x]$ is any polynomial. Prove there exist unique polynomials
$q, r \in F[x]$ with $f = qg + r$ and either $r = 0$ or $\deg(r) < \deg(g)$. Describe an algorithm for
computing $q$ and $r$ given $f$ and $g$. (b) Show that (a) can fail if $F$ is a commutative ring that
is not a field. (c) Is (a) true for all commutative rings $F$ if we assume $g$ is monic?


5.88. Verify 5.14. 


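5.89. Given $a, b \in \mathbb{Z}$ with $b > 0$, let $a \bmod b$ be the unique remainder $r \in \mathbf{b}$ such that
$a = bq + r$ for some $q \in \mathbb{Z}$. (a) Given $s, t > 0$, consider the map $f : \mathbf{st} \to \mathbf{s} \times \mathbf{t}$ given
by $f(x) = (x \bmod s, x \bmod t)$ for $x \in \mathbf{st}$. Prove that $f$ is a bijection iff $\gcd(s, t) = 1$. (b)
Generalize (a) to maps from $\mathbf{s_1 s_2 \cdots s_k}$ to $\mathbf{s_1} \times \mathbf{s_2} \times \cdots \times \mathbf{s_k}$.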


5.90. Complexity of Binary Arithmetic. Let $x$ and $y$ be $k$-bit numbers (this means
the base-2 expansions of $x$ and $y$ have zeroes beyond the first $k$ positions). (a) Show that
there is an algorithm to compute the base-2 expansion of $x + y$ (or $x - y$) that requires
at most $ck$ bit operations, for some constant $c$. (b) Show that the base-2 expansion of $xy$
can be computed in at most $ck^2$ bit operations, for some constant $c$. (See 7.175 for a faster
method.) (c) If $y > 0$, there exist unique $q, r \in \mathbb{Z}$ with $x = qy + r$ and $0 \le r < y$. Show that
$q$ and $r$ can be computed from $x$ and $y$ in at most $ck^2$ bit operations, for some constant $c$.


5.91. Binary Exponentiation. Suppose $n$ is a $k$-bit number, $x \in \mathbf{n}$, and $e$ is an $m$-bit
number. Show that we can compute $x^e \bmod n$ in at most $ck^2 m$ bit operations, for some
constant $c$.


5.92. Euclid’s Algorithm for Computing GCD’s. (a) Show that for all nonzero
$x \in \mathbb{Z}$, $\gcd(x, 0) = x$. Show that if $x, y, q, r \in \mathbb{Z}$ satisfy $y \ne 0$ and $x = qy + r$ then
$\gcd(x, y) = \gcd(y, r)$. (b) Use (a) and 5.85 to develop an algorithm that will compute the
gcd of two $n$-bit numbers using at most $n$ integer divisions (hence at most $cn^3$ bit operations,
for some constant c). 


5.93. (a) Devise a ranking algorithm for $k$-element multisets of an $n$-element ordered
alphabet based on the recursion 2.26. Use this to rank the multiset $[b, b, c, d, d, d]$ over the
alphabet $\{a, b, c, d, e\}$ and to unrank 132 to get a 6-element multiset over this alphabet. (b)
Repeat part (a), but use a ranking algorithm based on one of the bijections in §1.11. 


5.94. Fix $k \in \mathbb{N}^+$. Prove that every $m \in \mathbb{N}$ can be written in exactly one way in the form
$m = \sum_{j=1}^{k} \binom{i_j}{j}$, where $0 \le i_1 < i_2 < \cdots < i_k$.

5.95. Fix $k \in \mathbb{N}^+$. Use 5.94 to find an explicit formula for a bijection $f : \mathbb{N}^k \to \mathbb{N}$ (cf. 1.149).


5.96. (a) Devise a ranking algorithm for derangements based on the recursion 4.24. (b) 
List all derangements of {1,2,3,4} in the order specified by your ranking algorithm. (c) 
Compute the rank of $3527614 \in D_7$. (d) Unrank 1776 to obtain a derangement in $D_7$.


5.97. Devise algorithms to rank and unrank partitions of n into k distinct parts. Use your 
algorithms to rank the partition (10,7,6,3,1) and unrank 10 to get a partition of 20 into 3 
distinct parts. 


5.98. Suppose we rewrite the recursion for Stirling numbers in the form 


S(n,k) = S(n—1,k)k + S(n—1,k-1) (n,k > 0). 


(a) Use the bijective product and sum rules (taking terms in the order written here) to devise 
ranking and unranking algorithms for set partitions in SP(n,k). (b) Rank the partition 
$\pi = \{\{1, 7\}, \{2, 4, 5\}, \{3, 8\}, \{6\}\}$ and unrank 111 to obtain an element of SP(7,3) (cf. 5.28).
(c) Repeat 5.71 using the new ranking algorithm. 


5.99. Use the recursion in 2.53 to develop ranking and unranking algorithms for the set 
SP(n) of all set partitions of an n-element set. Find the rank of {{1, 2,4}, {3, 5, 6}, {7, 8}}. 
Which set partition of 8 has rank 1394? 


5.100. Find ranking and unranking algorithms for three-of-a-kind poker hands. Use these 
algorithms to rank the three-of-a-kind hand {49, 4@, 4, 9&, A@} and to unrank 21,751. 


5.101. Find ranking and unranking algorithms for one-pair poker hands. Use these algo- 
rithms to rank the one-pair hand {29, 24,7, 8, 10} and to unrank 497,079. 


5.102. (a) Find ranking and unranking algorithms for straight poker hands (including 
straight flushes). Use these algorithms to rank the straight hand {49, 59, 6&, 7,80} and 
to unrank 1574. (b) Repeat part (a) for the set of straight hands that are not flushes. 


5.103. (a) Find ranking and unranking algorithms for flush poker hands (including straight 
flushes). Use these algorithms to rank the flush hand {39, 79, 109, JV, KY} and to unrank 
4716. (b) Repeat part (a) for the set of flush hands that are not straights. 


5.104. Develop ranking and unranking algorithms for 231-avoiding permutations. Find the 
rank of $w = 1\ 5\ 2\ 4\ 3\ 11\ 7\ 6\ 10\ 8\ 9$. Unrank 231 to get a 231-avoiding permutation of length 7.


5.105. Develop ranking and unranking algorithms for the set of subsets of {1,2,...,n} 
that do not contain two consecutive integers (cf. 2.130(b)). For n = 10, rank the subset 
{2,5,7,10} and unrank 42. 


5.106. (a) Use the pruning bijections in §3.12 to develop ranking and unranking algorithms 
for the set of trees with vertex set $\{v_1, \ldots, v_n\}$ such that $\deg(v_j) = d_j$ for all $j$ (where the
$d_j$ are positive integers summing to $2n - 2$). (b) Given $(d_1, \ldots, d_9) = (1, 2, 1, 1, 1, 2, 3, 1, 4)$,
find the rank of the tree shown in Figure 3.16. (c) Unrank 129 to obtain a tree with the 
degrees d; from part (b). 


5.107. Write a successor algorithm for listing integer partitions of n into k parts in lexi- 
cographic order (see 10.36). Use your algorithm to find the successors of (9,3), (7,4, 2,1), 
and (3,3, 1,1, 1). 


5.108. Write a successor algorithm for listing all integer partitions of $n$ in lexicographic
order (see 10.36). Use your algorithm to find the successors of (9,3), (7,4,2,1), and (3,3,1,1,1).

5.109. Write a successor algorithm for listing set partitions of {1,2,...,n} into k blocks,
using any convenient ordering. Find the successor and predecessor of the set partition


{{1, 7}, {2}, {3, 5, 6}, {4, 8}}.


5.110. (a) Write a successor algorithm for listing full-house poker hands, using any 
convenient ordering. (b) Use your algorithm to determine the successor of the hand 
{ Jd, J, J&, 9%, 90}. (c) What is the predecessor of the hand in (b)? 


5.111. (a) Write a successor algorithm for listing one-pair poker hands, using any convenient 
ordering. (b) Use your algorithm to find the successor of the hand {20, 2&, 70, 80,10}. 
(c) What is the predecessor of the hand in (b)? 


5.112. Describe a successor algorithm for ranking rooted trees with vertex set {1,2,...,n} 
rooted at vertex 1. Compute the successor and predecessor of the tree shown in Figure 3.9. 


5.113. Devise a random selection algorithm for choosing anagrams in $R(a_1^{n_1} \cdots a_k^{n_k})$.


5.114. (a) Devise a random selection algorithm for choosing integer partitions of n into k 
parts. (b) Write an algorithm for choosing a random integer partition of n. 


5.115. Devise a random selection algorithm for choosing a derangement of n letters. 


5.116. Confirm that the random selection algorithm for permutations described at the end
of §5.14 will generate every permutation in $S_n$ with equal probability.


5.117. Consider the following proposed algorithm for randomly selecting a permutation
$w \in S_n$. Initially, set $w_i = i$ for $1 \le i \le n$. Next, for $1 \le i \le n$, exchange $w_i$ and $w_j$, where
$j$ is chosen randomly in $\{1,2,\ldots,n\}$. Does this method produce every element of $S_n$ with
equal probability? Explain.


5.118. Modify the algorithm in the previous exercise by exchanging $w_i$ with $w_j$, where $j$
is chosen randomly in $\{1,2,\ldots,i\}$ at each stage. Does this method produce every element
of $S_n$ with equal probability? Explain.


5.119. Consider the following proposed algorithm for randomly selecting a permutation
$w \in S_n$. Choose $w_1 \in \{1,2,\ldots,n\}$ at random. For $i = 2,\ldots,n$ in turn, repeatedly choose
$w_i \in \{1,2,\ldots,n\}$ at random until a value different from $w_1,\ldots,w_{i-1}$ is obtained. Argue
informally that this algorithm will produce every permutation in $S_n$ with equal likelihood,
but that the expected number of random choices needed to generate one permutation in $S_n$
is


\[ n(1 + 1/2 + 1/3 + \cdots + 1/n) \approx n \ln n. \]


5.120. Devise a ranking algorithm for 4-letter words in which Q is always followed by U (so 
Q cannot be the last letter). Use your algorithm to rank AQUA and QUIT and to unrank 
1000. Can you find an algorithm that generates these words in alphabetical order? Can you 
generalize to n-letter words? 


5.121. Devise a ranking algorithm for 5-letter words that never have two consecutive vowels. 
Use your algorithm to rank BILBO and THIRD and to unrank 9999. Can you find an 
algorithm that generates these words in alphabetical order? Can you generalize to n-letter 
words? 

Notes


Our presentation of ranking and random selection places great emphasis on bijections con- 
structed automatically by repeated use of the bijective sum and product rules. For a some- 
what different approach based on a multigraph model, see the papers [141, 142]. Other 
discussions of ranking and related problems can be found in the texts [10, 100, 128]. An 
encyclopedic treatment of algorithms for generating combinatorial objects may be found in 
Knuth’s comprehensive treatise [78]. 


6 


Counting Weighted Objects 


In earlier chapters, we have spent a lot of time studying the counting problem: given a finite 
set S, how many elements does S have? This chapter generalizes the counting problem to 
the following situation. Given a finite set S of objects, where each object is assigned an 
integer-valued weight, how many objects in S are there of each given weight? A convenient 
way to present the answer to this question is via a generating function, which is a polynomial 
$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ such that $a_k$ (the coefficient of $x^k$) is the number of objects
in S of weight k. After giving the basic definitions, we will develop rules for manipulating 
generating functions that are analogous to the sum rule and product rule from Chapter 1. We 
will also derive formulas for certain generating functions that generalize factorials, binomial 
coefficients, and multinomial coefficients. In later chapters, we extend all of these ideas to 
the more general situation where S is an infinite set of weighted objects.




6.1 Weighted Sets 


This section presents the basic definitions needed to discuss sets of weighted objects, to- 
gether with many examples. 


6.1. Definition: Weighted Sets. A weighted set is a pair $(S,\operatorname{wt})$, where $S$ is a set and
$\operatorname{wt} : S \to \mathbb{N}$ is a function from $S$ to the nonnegative integers. For each $z \in S$, the integer
$\operatorname{wt}(z)$ is called the weight of $z$. In this definition, $S$ is not required to be finite, although we
shall always make that assumption in this chapter. The weight function wt is also sometimes 
referred to as a statistic on S. If the weight function is understood from the context, we 
may sometimes refer to “the weighted set S.” 


6.2. Definition: Generating Function for a Weighted Set. Given a finite weighted
set $(S,\operatorname{wt})$, the generating function for $S$ is the polynomial


\[ G_{S,\operatorname{wt}}(x) = \sum_{z \in S} x^{\operatorname{wt}(z)}. \]


We also write $G_S(x)$ or $G(x)$ if the weight function and set are understood from context.
Note that the sum on the right side is well-defined, since $S$ is finite and addition of
polynomials is an associative and commutative operation (see the discussion following 2.2
and 2.149).


6.3. Example. Suppose $S = \{a,b,c,d,e,f\}$, and $\operatorname{wt} : S \to \mathbb{N}$ is given by
\[ \operatorname{wt}(a) = 4,\ \operatorname{wt}(b) = 1,\ \operatorname{wt}(c) = 0,\ \operatorname{wt}(d) = 4,\ \operatorname{wt}(e) = 4,\ \operatorname{wt}(f) = 1. \]
The generating function for $(S,\operatorname{wt})$ is


\[ G_{S,\operatorname{wt}}(x) = x^{\operatorname{wt}(a)} + x^{\operatorname{wt}(b)} + \cdots + x^{\operatorname{wt}(f)} = x^4 + x^1 + x^0 + x^4 + x^4 + x^1 = 1 + 2x + 3x^4. \]

Consider another weight function $w : S \to \mathbb{N}$ given by $w(a) = 0$, $w(b) = 1$, $w(c) = 2$,
$w(d) = 3$, $w(e) = 4$, and $w(f) = 5$. Using this weight function, we obtain a different
generating function, namely


\[ G_{S,w}(x) = 1 + x + x^2 + x^3 + x^4 + x^5. \]
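
6.4. Example. Suppose $S$ is the set of all subsets of $\{1,2,3\}$, so


\[ S = \{\emptyset, \{1\}, \{2\}, \{3\}, \{1,2\}, \{1,3\}, \{2,3\}, \{1,2,3\}\}. \]


Consider three different weight functions $w_i : S \to \mathbb{N}$, given by


\[ w_1(A) = |A|; \qquad w_2(A) = \sum_{i \in A} i; \qquad w_3(A) = \min_{i \in A} i \qquad (\text{for all } A \in S). \]


(By convention, define $w_3(\emptyset) = 0$.) Each of these statistics leads to a different generating
function:


\begin{align*}
G_{S,w_1}(x) &= x^0 + x^1 + x^1 + x^1 + x^2 + x^2 + x^2 + x^3 = 1 + 3x + 3x^2 + x^3 = (1+x)^3; \\
G_{S,w_2}(x) &= x^0 + x^1 + x^2 + x^3 + x^3 + x^4 + x^5 + x^6 = 1 + x + x^2 + 2x^3 + x^4 + x^5 + x^6; \\
G_{S,w_3}(x) &= x^0 + x^1 + x^2 + x^3 + x^1 + x^1 + x^2 + x^1 = 1 + 4x + 2x^2 + x^3.
\end{align*}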
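
The definition of a generating function translates directly into a short computation. The following is a minimal Python sketch, not part of the text's development; the helper name `generating_function` and the choice to store a polynomial as its list of coefficients $[a_0, a_1, \ldots]$ are assumptions made for illustration. It recomputes the three generating functions of Example 6.4 by brute force.

```python
from collections import Counter
from itertools import combinations


def generating_function(objects, weight):
    """Return [a_0, a_1, ...], where a_k is the number of objects of weight k.

    `generating_function` is an illustrative helper name, not notation from the text.
    """
    counts = Counter(weight(z) for z in objects)
    if not counts:
        return []  # the empty set has generating function 0
    return [counts[k] for k in range(max(counts) + 1)]


# All subsets of {1, 2, 3}, as in Example 6.4.
subsets = [frozenset(c) for r in range(4) for c in combinations((1, 2, 3), r)]

gf_size = generating_function(subsets, len)                       # w_1(A) = |A|
gf_sum = generating_function(subsets, sum)                        # w_2(A) = sum of elements
gf_min = generating_function(subsets, lambda A: min(A) if A else 0)  # w_3, with w_3(empty) = 0

print(gf_size, gf_sum, gf_min)
```

Listing every object is exactly what the later sections try to avoid, so a routine like this is useful mainly for checking small cases against a formula.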


6.5. Example. For each integer $n > 0$, consider the set $\mathbf{n} = \{0,1,2,\ldots,n-1\}$. Define
a weight function on this set by letting $\operatorname{wt}(i) = i$ for all $i \in \mathbf{n}$. The associated generating
function is


\[ G_{\mathbf{n},\operatorname{wt}}(x) = x^0 + x^1 + x^2 + \cdots + x^{n-1} = \frac{x^n - 1}{x - 1}. \]


The last equality can be verified by using the distributive law to calculate


\[ (x - 1)(1 + x + x^2 + \cdots + x^{n-1}) = x^n - 1. \]


The generating function in this example will be a recurring building block in our later work, 
so we give it a special name. 


6.6. Definition: Quantum Integers. If $n$ is a positive integer and $x$ is any variable, we
define


\[ [n]_x = 1 + x + x^2 + \cdots + x^{n-1} = \frac{x^n - 1}{x - 1}. \]


We also define $[0]_x = 0$. The polynomial $[n]_x$ is called the quantum integer $n$ (relative to
the variable $x$).


6.7. Example. Let $S$ be the set of all lattice paths from $(0,0)$ to $(2,3)$. For $P \in S$, let
$w(P)$ be the number of unit squares in the region bounded by $P$, the $x$-axis, and the line
$x = 2$. Let $w'(P)$ be the number of unit squares in the region bounded by $P$, the $y$-axis,
and the line $y = 3$. By examining the paths in Figure 1.7 and collecting equal powers of $x$, we compute


\begin{align*}
G_{S,w}(x) &= 1 + x + 2x^2 + 2x^3 + 2x^4 + x^5 + x^6; \\
G_{S,w'}(x) &= 1 + x + 2x^2 + 2x^3 + 2x^4 + x^5 + x^6.
\end{align*}


Although the two weight functions are not equal (since there are paths $P$ with $w(P) \ne
w'(P)$), it happens that $G_{S,w} = G_{S,w'}$ in this example.

Now, consider the set $T$ of Dyck paths from $(0,0)$ to $(3,3)$. For $P \in T$, let $\operatorname{wt}(P)$ be
the number of complete unit squares located between $P$ and the diagonal line $y = x$. Using
Figure 1.8, we find that


\[ G_{T,\operatorname{wt}}(x) = x^3 + x^2 + x^1 + x^1 + x^0 = 1 + 2x + x^2 + x^3. \]


6.8. Remark. Let $(S,\operatorname{wt})$ be a finite set of weighted objects. We know $G_S(x) =
\sum_{z \in S} x^{\operatorname{wt}(z)}$. By collecting together equal powers of $x$ (as done in the calculations above),
we can write $G_S$ in the standard form


\[ G_S(x) = a_0 x^0 + a_1 x^1 + a_2 x^2 + \cdots + a_m x^m \qquad (a_i \in \mathbb{N}). \]

Comparing the two formulas for $G_S$, we see that the coefficient $a_i$ of $x^i$ in $G_{S,\operatorname{wt}}(x)$ is the
number of objects $z$ in $S$ such that $\operatorname{wt}(z) = i$. We now illustrate this observation with several
examples.

6.9. Example. Suppose $T$ is the set of all set partitions of an $n$-element set, and the weight
of a partition is the number of blocks in the partition. By definition of the Stirling number
of the second kind (see 2.51), we have


\[ G_T(x) = \sum_{k=0}^{n} S(n,k) x^k. \]


Similarly, if $U$ is the set of all permutations of $n$ elements, weighted by the number of cycles
in the disjoint cycle decomposition, then


\[ G_U(x) = \sum_{k=0}^{n} s'(n,k) x^k, \]


where $s'(n,k)$ is a signless Stirling number of the first kind (see §3.6). Finally, if $V$ is the
set of all integer partitions of $n$, weighted by number of parts, then


\[ G_V(x) = \sum_{k=0}^{n} p(n,k) x^k. \]


This is also the generating function for $V$ if we weight a partition by the length of its largest
part (§2.8).


6.10. Remark. Suppose we replace the variable $x$ in $G_S(x)$ by the value 1. We obtain
$G_S(1) = \sum_{z \in S} 1^{\operatorname{wt}(z)} = \sum_{z \in S} 1 = |S|$. For example, in 6.7, $G_{T,\operatorname{wt}}(1) = 5 = C_3$; in 6.9,
$G_T(1) = B(n)$ (the Bell number), $G_U(1) = n!$, and $G_V(1) = p(n)$. Thus, the generating
function $G_S(x)$ can be viewed as a weighted analogue of "the number of elements in $S$."
On the other hand, using the convention that $0^0 = 1$, $G_S(0)$ is the number of objects in $S$
having weight zero.

We also note that the polynomial $G_S(x)$ can sometimes be factored or otherwise simplified,
as illustrated by the first weight function in 6.4. Different statistics on $S$ usually lead
to different generating functions, but this is not always true (see 6.4 and 6.7).


Our goal in this chapter is to develop techniques for finding and manipulating generating 
functions that avoid listing all the objects in S, as we did in the examples above. 


6.2 Inversions


Before presenting the sum and product rules for generating functions, we introduce an 
example of a weight function that arises frequently in algebraic combinatorics. 


6.11. Definition: Inversions. Suppose $w = w_1 w_2 \cdots w_n$ is a word, where each letter $w_i$ is
an integer. An inversion of $w$ is a pair of indices $i < j$ such that $w_i > w_j$. We write $\operatorname{inv}(w)$
for the number of inversions of $w$; in other words,


\[ \operatorname{inv}(w_1 w_2 \cdots w_n) = \sum_{1 \le i < j \le n} \chi(w_i > w_j). \]


Thus $\operatorname{inv}(w)$ counts pairs of letters in $w$ (not necessarily adjacent) that are out of numerical
order. We also define $\operatorname{Inv}(w)$ to be the set of all inversion pairs $(i,j)$, so $|\operatorname{Inv}(w)| = \operatorname{inv}(w)$.
If $S$ is any finite set of words over the alphabet $\mathbb{Z}$, then


\[ G_{S,\operatorname{inv}}(x) = \sum_{w \in S} x^{\operatorname{inv}(w)} \]


is the inversion generating function for $S$. These definitions extend to words over any totally
ordered alphabet.


6.12. Example. Consider the word $w = 414253$; here $w_1 = 4$, $w_2 = 1$, $w_3 = 4$, etc. The
pair $(1,2)$ is an inversion of $w$ since $w_1 = 4 > 1 = w_2$. The pair $(2,3)$ is not an inversion,
since $w_2 = 1 < 4 = w_3$. Similarly, $(1,3)$ is not an inversion. Continuing in this way, we find
that


\[ \operatorname{Inv}(w) = \{(1,2), (1,4), (1,6), (3,4), (3,6), (5,6)\}, \]
so $\operatorname{inv}(w) = 6$.

6.13. Example. Let $S$ be the set of all permutations of $\{1,2,3\}$. We know that
\[ S = \{123, 132, 213, 231, 312, 321\}. \]
Counting inversions, we conclude that
\[ G_{S,\operatorname{inv}}(x) = x^0 + x^1 + x^1 + x^2 + x^2 + x^3 = 1 + 2x + 2x^2 + x^3 = 1(1+x)(1+x+x^2). \]


Note that $G_S(1) = 6 = 3! = |S|$. Similarly, if $T$ is the set of all permutations of $\{1,2,3,4\}$,
a longer calculation (see 6.50) leads to


\[ G_{T,\operatorname{inv}}(x) = 1 + 3x + 5x^2 + 6x^3 + 5x^4 + 3x^5 + x^6 = 1(1+x)(1+x+x^2)(1+x+x^2+x^3). \]
The factorization patterns in these examples will be explained and generalized below. 
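
The inversion counts in Examples 6.12 and 6.13 are easy to confirm by machine. Here is a brute-force Python sketch under the definitions of 6.11; the function names `inv`, `inversion_pairs`, and `inv_gf` are illustrative choices, and polynomials are again stored as coefficient lists.

```python
from itertools import permutations


def inversion_pairs(w):
    """Return Inv(w) as a list of 1-based index pairs (i, j) with i < j and w_i > w_j."""
    n = len(w)
    return [(i + 1, j + 1) for i in range(n) for j in range(i + 1, n) if w[i] > w[j]]


def inv(w):
    """Number of inversions of the word w."""
    return len(inversion_pairs(w))


def inv_gf(n):
    """Coefficient list of the inversion generating function for permutations of {1,...,n}."""
    coeffs = [0] * (n * (n - 1) // 2 + 1)  # inv(w) is at most n(n-1)/2
    for w in permutations(range(1, n + 1)):
        coeffs[inv(w)] += 1
    return coeffs


print(inversion_pairs([4, 1, 4, 2, 5, 3]))  # the word of Example 6.12
print(inv_gf(3), inv_gf(4))                 # compare with Example 6.13
```

This enumeration takes time proportional to $n! \cdot n^2$, so it is only a check for small $n$.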


6.14. Example. Let $S = R(0^2 1^3)$ be the set of all rearrangements of two zeroes and three
ones. We know that


\[ S = \{00111, 01011, 01101, 01110, 10011, 10101, 10110, 11001, 11010, 11100\}. \]


Counting inversions, we conclude that


\[ G_{S,\operatorname{inv}}(x) = x^0 + x^1 + x^2 + x^3 + x^2 + x^3 + x^4 + x^4 + x^5 + x^6 = 1 + x + 2x^2 + 2x^3 + 2x^4 + x^5 + x^6. \]


The reader may notice that this is the same generating function that appeared in 6.7. This 
is not a coincidence; we explain why this happens in the next section. 

6.15. Example. Let $S = R(a^1 b^1 c^2)$, where we use $a < b < c$ as the ordering of the
alphabet. We know that


\[ S = \{abcc, acbc, accb, bacc, bcac, bcca, cabc, cacb, cbac, cbca, ccab, ccba\}. \]
Counting inversions leads to
\[ G_{S,\operatorname{inv}}(x) = 1 + 2x + 3x^2 + 3x^3 + 2x^4 + x^5. \]


Now let $T = R(a^1 b^2 c^1)$ and $U = R(a^2 b^1 c^1)$ with the same ordering of the alphabet. The
reader is invited to confirm that


\[ G_{S,\operatorname{inv}}(x) = G_{T,\operatorname{inv}}(x) = G_{U,\operatorname{inv}}(x), \]


although the sets of words in question are all different. This phenomenon will also be 
explained in the coming sections. 
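
The confirmation requested in Example 6.15 can be done by exhaustive enumeration. The sketch below is an illustration only: the name `rearrangement_inv_gf` is hypothetical, words are represented as Python strings, and the letter ordering is the built-in character ordering (so '0' < '1' and 'a' < 'b' < 'c').

```python
from itertools import permutations


def inv(w):
    """Number of pairs i < j with w[i] > w[j]."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def rearrangement_inv_gf(word):
    """Inversion generating function (coefficient list) for all distinct rearrangements of `word`."""
    distinct = set(permutations(word))  # the set removes repeats caused by equal letters
    coeffs = [0] * (max(inv(w) for w in distinct) + 1)
    for w in distinct:
        coeffs[inv(w)] += 1
    return coeffs


print(rearrangement_inv_gf("00111"))  # Example 6.14
print(rearrangement_inv_gf("abcc"))   # S in Example 6.15
print(rearrangement_inv_gf("abbc"))   # T in Example 6.15
print(rearrangement_inv_gf("aabc"))   # U in Example 6.15
```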


6.16. Remark. It can be shown that for any word w, inv(w) is the minimum number of 
transpositions of adjacent letters required to sort the letters of w into weakly increasing 
order (see 9.29 and 9.179). 
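
One half of this remark can be observed experimentally. In the sketch below (an illustration, not the proof referenced above), a bubble-sort style procedure only ever swaps an adjacent pair that is out of order; each such swap removes exactly one inversion, so the procedure uses exactly $\operatorname{inv}(w)$ swaps. The code does not demonstrate that fewer swaps are impossible. The name `adjacent_swaps_to_sort` is a hypothetical helper.

```python
def inv(w):
    """Number of pairs i < j with w[i] > w[j]."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def adjacent_swaps_to_sort(w):
    """Sort a copy of w by swapping adjacent out-of-order letters; return the number of swaps."""
    letters = list(w)
    swaps = 0
    changed = True
    while changed:
        changed = False
        for i in range(len(letters) - 1):
            if letters[i] > letters[i + 1]:
                # Swapping an adjacent descending pair lowers the inversion count by one.
                letters[i], letters[i + 1] = letters[i + 1], letters[i]
                swaps += 1
                changed = True
    return swaps


w = [4, 1, 4, 2, 5, 3]
print(adjacent_swaps_to_sort(w), inv(w))
```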




6.3 Weight-Preserving Bijections 


In the next few sections, we introduce three fundamental rules that we will use to give 
combinatorial derivations of many generating function formulas. These rules are weighted 
analogues of the counting rules studied in Chapter 1. The first rule generalizes 1.30. We 
need one new definition to state this rule. 


6.17. Definition: Weight-Preserving Bijections. Let $(S,w_1)$ and $(T,w_2)$ be two
weighted sets. A weight-preserving bijection from $(S,w_1)$ to $(T,w_2)$ is a bijection $f : S \to T$
such that

\[ w_2(f(z)) = w_1(z) \quad \text{for all } z \in S. \]


6.18. Theorem: Bijection Rule for Generating Functions. Suppose $(S,w_1)$ and
$(T,w_2)$ are two finite weighted sets such that there exists a weight-preserving bijection
$f : S \to T$. Then

\[ G_{S,w_1}(x) = G_{T,w_2}(x). \]


Proof. Let $g : T \to S$ be the inverse of $f$. One verifies that $g$ is a weight-preserving bijection,
since $f$ is. For each $k \ge 0$, let $S_k = \{z \in S : w_1(z) = k\}$ and $T_k = \{u \in T : w_2(u) = k\}$.
Since $f$ and $g$ preserve weights, they restrict to give maps $f_k : S_k \to T_k$ and $g_k : T_k \to S_k$
that are mutual inverses. Therefore, $|S_k| = |T_k|$ for all $k \ge 0$. It follows that


\[ G_{S,w_1}(x) = \sum_{k \ge 0} |S_k| x^k = \sum_{k \ge 0} |T_k| x^k = G_{T,w_2}(x). \qquad \Box \]


6.19. Example. Let $S$ be the set of all lattice paths $P$ from $(0,0)$ to $(a,b)$, and let $\operatorname{area}(P)$
be the area below the path and above the $x$-axis (cf. 6.7). Let $T = R(0^a 1^b)$ be the set
of all words consisting of $a$ zeroes and $b$ ones, weighted by inversions. There is a bijection
$g : T \to S$ obtained by converting zeroes to east steps and ones to north steps. By examining
a picture, one sees that $\operatorname{inv}(w) = \operatorname{area}(g(w))$ for all $w \in T$. For example, if $w = 1001010$,
then $g(w)$ is the lattice path shown in Figure 6.1.


FIGURE 6.1
Inversions of a word vs. area under a lattice path.


The four area cells in the lowest row
correspond to the inversions between the first 1 in $w$ and the four zeroes occurring later.
Similarly, the two area cells in the next lowest row come from the inversions between the
second 1 in $w$ and the two zeroes occurring later. Since $g$ is a weight-preserving bijection,
we conclude that
\[ G_{T,\operatorname{inv}}(x) = G_{S,\operatorname{area}}(x). \]

When $a = 2$ and $b = 3$, this explains the equality of generating functions observed in 6.7
and 6.14. In 6.7, we also considered another weight on paths $P \in S$, namely the number of
area squares between $P$ and the $y$-axis. Denoting this weight by $\operatorname{area}'$, we have


\[ G_{S,\operatorname{area}}(x) = G_{S,\operatorname{area}'}(x) \]


(for arbitrary $a$ and $b$). This follows from the weight-preserving bijection rule, since rotating
a path $180^\circ$ about $(a/2, b/2)$ defines a bijection $r : S \to S$ that sends $\operatorname{area}$ to $\operatorname{area}'$. Similarly,
letting $S'$ be the set of paths from $(0,0)$ to $(b,a)$, we have


\[ G_{S,\operatorname{area}}(x) = G_{S',\operatorname{area}'}(x) \quad (= G_{S',\operatorname{area}}(x)) \]


since reflection in the diagonal line $y = x$ defines a weight-preserving bijection from $(S, \operatorname{area})$
to $(S', \operatorname{area}')$. We will soon derive an explicit formula for the generating functions occurring
in this example (§6.7).
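
The claim $\operatorname{inv}(w) = \operatorname{area}(g(w))$ can be spot-checked without drawing pictures. The following Python sketch is an illustration under one assumption about representation: a path is never built explicitly, and `area_under_path` (a hypothetical name) reads the word directly, treating '0' as an east step and '1' as a north step. An east step taken at height $h$ has exactly $h$ unit squares between it and the $x$-axis.

```python
from itertools import permutations


def inv(w):
    """Number of pairs i < j with w[i] > w[j]."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def area_under_path(word):
    """Area between the x-axis and the lattice path encoded by a 0/1 word."""
    height = 0
    area = 0
    for step in word:
        if step == "1":
            height += 1      # north step: the path rises by one
        else:
            area += height   # east step: adds a column of `height` unit squares
    return area


print(inv("1001010"), area_under_path("1001010"))

# Check the weight-preserving property on all of R(0^2 1^3).
words = {"".join(p) for p in permutations("00111")}
print(all(inv(w) == area_under_path(w) for w in words))
```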




6.4 Sum and Product Rules for Weighted Sets 
Now we discuss the weighted analogues of the sum and product rules. 


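6.20. Theorem: Sum Rule for Weighted Sets. Suppose $(S_1,\operatorname{wt}_1), (S_2,\operatorname{wt}_2), \ldots,
(S_k,\operatorname{wt}_k)$ are finite weighted sets such that $S_1,\ldots,S_k$ are pairwise disjoint sets. Let
$S = S_1 \cup \cdots \cup S_k$, and define $\operatorname{wt} : S \to \mathbb{N}$ by setting $\operatorname{wt}(z) = \operatorname{wt}_i(z)$ for all $z \in S_i$.
Then

\[ G_{S,\operatorname{wt}}(x) = G_{S_1,\operatorname{wt}_1}(x) + G_{S_2,\operatorname{wt}_2}(x) + \cdots + G_{S_k,\operatorname{wt}_k}(x). \]


Proof. By definition, $G_{S,\operatorname{wt}}(x) = \sum_{z \in S} x^{\operatorname{wt}(z)}$. Because addition of polynomials is commutative
and associative, we can order the terms of this sum so that all objects in $S_1$ come
first, followed by objects in $S_2$, and so on, ending with all objects in $S_k$. We obtain


\begin{align*}
G_{S,\operatorname{wt}}(x) &= \sum_{z \in S_1} x^{\operatorname{wt}(z)} + \sum_{z \in S_2} x^{\operatorname{wt}(z)} + \cdots + \sum_{z \in S_k} x^{\operatorname{wt}(z)} \\
&= \sum_{z \in S_1} x^{\operatorname{wt}_1(z)} + \sum_{z \in S_2} x^{\operatorname{wt}_2(z)} + \cdots + \sum_{z \in S_k} x^{\operatorname{wt}_k(z)} \\
&= G_{S_1,\operatorname{wt}_1}(x) + G_{S_2,\operatorname{wt}_2}(x) + \cdots + G_{S_k,\operatorname{wt}_k}(x). \qquad \Box
\end{align*}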




6.21. Theorem: Product Rule for Two Weighted Sets. Suppose $(T,w_1)$ and $(U,w_2)$
are two finite weighted sets. On the product set $S = T \times U$, define a weight $w$ by setting
$w((t,u)) = w_1(t) + w_2(u)$ for $t \in T$ and $u \in U$. Then


\[ G_{S,w}(x) = G_{T,w_1}(x) \cdot G_{U,w_2}(x). \]


Proof. The proof consists of the following calculation, which requires the generalized distributive
law (in the form given by 2.5):


\begin{align*}
G_{S,w}(x) &= \sum_{(t,u) \in S} x^{w((t,u))} = \sum_{(t,u) \in T \times U} x^{w_1(t) + w_2(u)} \\
&= \sum_{(t,u) \in T \times U} x^{w_1(t)} \cdot x^{w_2(u)} \\
&= \Bigl(\sum_{t \in T} x^{w_1(t)}\Bigr) \cdot \Bigl(\sum_{u \in U} x^{w_2(u)}\Bigr) \\
&= G_{T,w_1}(x) \cdot G_{U,w_2}(x). \qquad \Box
\end{align*}


6.22. Theorem: Product Rule for k Weighted Sets. Suppose that $(S_i,w_i)$ is a finite
weighted set for $1 \le i \le k$. On the product set $S = S_1 \times S_2 \times \cdots \times S_k$, define a weight $w$
by $w((z_1,\ldots,z_k)) = \sum_{i=1}^{k} w_i(z_i)$ for all $z_i \in S_i$. Then


\[ G_{S,w}(x) = \prod_{i=1}^{k} G_{S_i,w_i}(x). \]


Proof. This formula follows from the product rule for two weighted sets by induction on
$k$. Alternatively, one can mimic the proof given above for two sets, this time using the
full-blown version of the generalized distributive law (see 2.6). $\Box$


When discussing a Cartesian product of weighted sets, we always use the weight function 
on the product set given in the statement of the product rule (i.e., the weight is the sum of 
the weights of the component objects) unless otherwise specified. 
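
In computational terms, the product rule says that the generating function of a Cartesian product is obtained by multiplying coefficient lists as polynomials. The sketch below checks this on one small instance; the helper names `generating_function` and `poly_mul` and the particular sets chosen (the weighted set of Example 6.3 and the set $\{0,1,2\}$ with $\operatorname{wt}(u) = u$) are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def generating_function(objects, weight):
    """Coefficient list [a_0, a_1, ...]; a_k counts the objects of weight k."""
    counts = Counter(weight(z) for z in objects)
    return [counts[k] for k in range(max(counts) + 1)] if counts else []


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    if not p or not q:
        return []
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


w1 = {"a": 4, "b": 1, "c": 0, "d": 4, "e": 4, "f": 1}  # weights from Example 6.3
T = list(w1)
U = [0, 1, 2]                                          # weight of u is u itself
S = list(product(T, U))                                # the product set T x U

lhs = generating_function(S, lambda tu: w1[tu[0]] + tu[1])  # enumerate all pairs
rhs = poly_mul(generating_function(T, w1.get), generating_function(U, lambda u: u))
print(lhs, rhs)
```

The left side touches all $|T| \cdot |U|$ pairs, while the right side only needs the two factors, which is the practical payoff of the rule.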


6.23. Example. Given a positive integer $n$, consider the $n$ weighted sets $S_i = \mathbf{i} =
\{0,1,\ldots,i-1\}$ (for $1 \le i \le n$), where $\operatorname{wt}(z) = z$ for each $z \in \mathbf{i}$. We have seen in 6.5 that
$G_{S_i}(x) = [i]_x$, the quantum integer $i$. Now consider the product set $S = S_1 \times S_2 \times \cdots \times S_n =
\mathbf{1} \times \mathbf{2} \times \cdots \times \mathbf{n}$. By the product rule, the generating function for $S$ is


\[ G_S(x) = \prod_{i=1}^{n} G_{S_i}(x) = \prod_{i=1}^{n} [i]_x. \]


Since $G_S(1) = |S| = n!$, the polynomial $G_S(x)$ is a weighted analogue of a factorial. This
polynomial will arise frequently, so we give it a special name.


6.24. Definition: Quantum Factorials. For each $n \ge 1$ and every variable $x$, define the
quantum factorial of $n$ relative to $x$ to be the polynomial


\[ [n]!_x = \prod_{i=1}^{n} [i]_x = \prod_{i=1}^{n} (1 + x + x^2 + \cdots + x^{i-1}) = \prod_{i=1}^{n} \frac{x^i - 1}{x - 1}. \]


Also define $[0]!_x = 1$.


Observe that $[n]!_x = [n-1]!_x [n]_x$ for all $n \ge 1$.

6.25. Example. We have $[0]!_x = 1 = [1]!_x$; $[2]!_x = [2]_x = 1 + x$;
\begin{align*}
[3]!_x &= (1+x)(1+x+x^2) = 1 + 2x + 2x^2 + x^3; \\
[4]!_x &= (1+x)(1+x+x^2)(1+x+x^2+x^3) = 1 + 3x + 5x^2 + 6x^3 + 5x^4 + 3x^5 + x^6; \\
[5]!_x &= 1 + 4x + 9x^2 + 15x^3 + 20x^4 + 22x^5 + 20x^6 + 15x^7 + 9x^8 + 4x^9 + x^{10}.
\end{align*}


We can use other variables besides $x$; for instance, $[3]!_q = 1 + 2q + 2q^2 + q^3$. Sometimes we
will replace the variable here by a specific integer or real number; then the quantum factorial
will evaluate to some specific number. For example, when $q = 4$, $[3]!_q = 1 + 8 + 32 + 64 = 105$.
As another example, when $x = 1$, $[n]!_x = n!$.
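
The expansions in this example can be reproduced mechanically from the product formula in Definition 6.24. The following Python sketch assumes polynomials are stored as coefficient lists; `quantum_integer`, `quantum_factorial`, and `evaluate` are illustrative names.

```python
def quantum_integer(n):
    """Coefficients of [n]_x = 1 + x + ... + x^(n-1); [0]_x is the zero polynomial []."""
    return [1] * n


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    if not p or not q:
        return []
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def quantum_factorial(n):
    """Coefficients of [n]!_x = [1]_x [2]_x ... [n]_x, with [0]!_x = 1."""
    result = [1]
    for i in range(1, n + 1):
        result = poly_mul(result, quantum_integer(i))
    return result


def evaluate(coeffs, value):
    """Substitute a number for the variable."""
    return sum(c * value**k for k, c in enumerate(coeffs))


print(quantum_factorial(5))
print(evaluate(quantum_factorial(3), 4))  # the case q = 4 above
print(evaluate(quantum_factorial(5), 1))  # x = 1 recovers 5!
```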




6.5 Inversions and Quantum Factorials 


The reader may have recognized some of the quantum factorial polynomials above as match- 
ing the inversion generating functions for permutations in 6.13. The next theorem proves 
that this pattern holds in general. 


6.26. Theorem: Quantum Factorials and Inversions. For every $n \ge 0$, let $S_n$ be the
set of all permutations of $\{1,2,\ldots,n\}$, weighted by inversions. Then


\[ G_{S_n,\operatorname{inv}}(x) = \sum_{w \in S_n} x^{\operatorname{inv}(w)} = [n]!_x. \]


Proof. Let $T_n = \mathbf{1} \times \mathbf{2} \times \cdots \times \mathbf{n}$ with the usual product weight; we saw in 6.23 that
$G_{T_n,\operatorname{wt}}(x) = [n]!_x$. Therefore, to prove the theorem, it suffices to define a weight-preserving
bijection $f_n : S_n \to T_n$.

Let $w = w_1 w_2 \cdots w_n \in S_n$ be a permutation of $\{1,2,\ldots,n\}$. For each $k$ between 1 and
$n$, define $t_k$ to be the number of pairs $i < j$ such that $w_i > w_j$ and $w_i = k$; then define
$f_n(w) = (t_1,t_2,\ldots,t_n)$. In other words, $t_k = |\{(i,j) \in \operatorname{Inv}(w) : w_i = k\}|$ is the number of
inversions that the symbol $k$ has with smaller symbols to its right. There are $k-1$ possible
symbols less than $k$, so $t_k \in \{0,1,2,\ldots,k-1\} = \mathbf{k}$ for every $k$. Thus, $f_n(w)$ does lie in the
set $T_n$. For example, if $w = 4,2,8,5,1,6,7,3$, then $f_8(w) = (0,1,0,3,2,1,1,5)$. We have
$t_5 = 2$, for instance, because of the two inversions $(4,5)$ and $(4,8)$ caused by the entries $5 > 1$
and $5 > 3$ in $w$. Every inversion of $w$ is counted by exactly one of the numbers $t_k$, so that
$\operatorname{inv}(w) = \sum_{k=1}^{n} t_k = \operatorname{wt}(f_n(w))$ for all $w \in S_n$. This shows that $f_n$ is a weight-preserving
map.

To show that $f_n$ is a bijection, we display a two-sided inverse map $g_n : T_n \to S_n$. We
define $g_n$ by means of a recursive insertion procedure. The cases $n \le 1$ are immediate since
the sets involved have only one element. Assume $n > 1$ and $g_{n-1}$ has already been defined.
Given $(t_1,\ldots,t_n) \in T_n = T_{n-1} \times \mathbf{n}$, begin by computing $v = g_{n-1}(t_1,\ldots,t_{n-1})$, which is
a permutation of $\{1,2,\ldots,n-1\}$. To find $g_n(t_1,\ldots,t_n)$, we insert the symbol $n$ into the
permutation $v$ in such a way that $n$ will cause $t_n$ new inversions. This can always be done in
a unique way. Indeed, there are $n$ possible positions in $v$ where the symbol $n$ could be inserted
(after the last letter, or immediately before one of the $n-1$ existing letters). If we insert
$n$ after the last letter, it will create no new inversions. Scanning to the left, if we insert $n$
immediately before the $k$th letter from the far right, then this insertion will cause exactly $k$
new inversions, since $n$ exceeds all letters to its right, and no letter to the left of $n$ exceeds
$n$. Thus, the different insertion positions for $n$ lead to $0,1,\ldots,$ or $n-1$ new inversions,
and this is exactly the range of values for $t_n$. The recursive construction ensures that, for
all $k \le n$, $t_k$ is the number of inversion pairs involving the symbol $k$ on the left and some
smaller symbol on the right. One may check that $g_n$ is the two-sided inverse of $f_n$.

Here is an iterative description of the computation of $g_n(t_1,\ldots,t_n)$. Beginning with
an empty word, successively insert $1,2,\ldots,n$. At stage $k$, insert $k$ in the unique position
that will increase the total inversion count by $t_k$. For example, let us compute
$g_8(0,0,1,3,2,1,5,5)$. In the first two steps, we generate 1, 2, which has zero inversions. Then
we place the 3 one position left of the far right slot, obtaining the permutation 1, 3, 2 with one
inversion. Next we count three positions from the far right (arriving at the far left), and insert
4 to obtain the permutation 4, 1, 3, 2 with three new inversions and four total inversions.
The process continues, leading to 4,1,5,3,2; then 4,1,5,3,6,2; then 4,7,1,5,3,6,2; and finally
to $w = 4,7,8,1,5,3,6,2$. The reader may check that $f_8(w) = (0,0,1,3,2,1,5,5)$. $\Box$
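
Both maps in this proof are short to implement. The Python sketch below follows the descriptions of $f_n$ and of the iterative insertion procedure for $g_n$; the function names `inversion_table` and `from_inversion_table` are illustrative, and permutations are represented as lists of integers.

```python
from itertools import permutations


def inversion_table(w):
    """f_n: return (t_1, ..., t_n), where t_k counts symbols smaller than k to the right of k."""
    t = [0] * len(w)
    for i, k in enumerate(w):
        t[k - 1] = sum(1 for x in w[i + 1:] if x < k)
    return tuple(t)


def from_inversion_table(t):
    """g_n: insert 1, 2, ..., n in turn so that symbol k has exactly t_k letters to its right."""
    w = []
    for k, tk in enumerate(t, start=1):
        # All letters currently in w are smaller than k, so placing k with tk
        # letters to its right creates exactly tk new inversions.
        w.insert(len(w) - tk, k)
    return w


print(inversion_table([4, 2, 8, 5, 1, 6, 7, 3]))
print(from_inversion_table((0, 0, 1, 3, 2, 1, 5, 5)))

# The two maps undo each other on every permutation of {1, 2, 3, 4}.
print(all(from_inversion_table(inversion_table(w)) == list(w)
          for w in permutations(range(1, 5))))
```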


Because of the previous proof, we sometimes call the elements of T,, inversion tables. 
Other types of inversion tables for permutations can be constructed by classifying the in- 
versions of w in different ways. For example, our proof classified inversions by the value of 
the leftmost symbol in the inversion pair. One can also classify inversions using the value of 
the rightmost symbol, the position of the leftmost symbol, or the position of the rightmost 
symbol. These possibilities are explored in the exercises. 




6.6 Descents and Major Index 


This section introduces more statistics on words, which will lead to another combinatorial 
interpretation for the quantum factorial $[n]!_x$.


6.27. Definition: Descents and Major Index. Let $w = w_1 w_2 \cdots w_n$ be a word over a
totally ordered alphabet $A$. The descent set of $w$, denoted $\operatorname{Des}(w)$, is the set of all $i < n$
such that $w_i > w_{i+1}$. This is the set of positions in $w$ where a letter is immediately followed
by a smaller letter. Define the descent count for $w$ by $\operatorname{des}(w) = |\operatorname{Des}(w)|$. Define the major
index of $w$, denoted $\operatorname{maj}(w)$, to be the sum of the elements of the set $\operatorname{Des}(w)$. In symbols,
we can write


\[ \operatorname{des}(w) = \sum_{i=1}^{n-1} \chi(w_i > w_{i+1}); \qquad \operatorname{maj}(w) = \sum_{i=1}^{n-1} i\,\chi(w_i > w_{i+1}). \]


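6.28. Example. If $w = 47815362$, then $\operatorname{Des}(w) = \{3,5,7\}$, $\operatorname{des}(w) = 3$, and $\operatorname{maj}(w) =
3 + 5 + 7 = 15$. If $w = 101100101$, then $\operatorname{Des}(w) = \{1,4,7\}$, $\operatorname{des}(w) = 3$, and $\operatorname{maj}(w) = 12$.
If $w = 33555789$, then $\operatorname{Des}(w) = \emptyset$, $\operatorname{des}(w) = 0$, and $\operatorname{maj}(w) = 0$.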
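
These three statistics are one-liners in code. The sketch below is an illustration with hypothetical function names `descent_set`, `des`, and `maj`; words are given as strings of digits, and positions are reported 1-based to match the definition.

```python
def descent_set(w):
    """Des(w): the 1-based positions i with w_i > w_{i+1}."""
    return {i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]}


def des(w):
    """Number of descents of w."""
    return len(descent_set(w))


def maj(w):
    """Major index: the sum of the descent positions of w."""
    return sum(descent_set(w))


for word in ("47815362", "101100101", "33555789"):
    print(word, sorted(descent_set(word)), des(word), maj(word))
```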


6.29. Theorem: Quantum Factorials and Major Index. For every $n \ge 0$, let $S_n$ be
the set of all permutations of $\{1,2,\ldots,n\}$, weighted by major index. Then


\[ G_{S_n,\operatorname{maj}}(x) = \sum_{w \in S_n} x^{\operatorname{maj}(w)} = [n]!_x. \]


Proof. As in the case of inversions, it suffices to define a weight-preserving bijection $f_n :
S_n \to T_n = \mathbf{1} \times \mathbf{2} \times \cdots \times \mathbf{n}$. We use a variation of the "inversion-table" idea, adapted to
the major index statistic. Given $w \in S_n$ and $0 \le k \le n$, let $w^{(k)}$ be the word obtained
from $w$ by erasing all letters larger than $k$. Then define $f_n(w) = (t_1,t_2,\ldots,t_n)$, where
$t_k = \operatorname{maj}(w^{(k)}) - \operatorname{maj}(w^{(k-1)})$. Intuitively, if we imagine building up $w$ from the empty
word by inserting $1,2,\ldots,n$ in this order, then $t_k$ records the extra contribution to maj
caused by the insertion of the new symbol $k$. For example, given $w = 42851673$, we compute
$\operatorname{maj}(\epsilon) = 0$, $\operatorname{maj}(1) = 0$, $\operatorname{maj}(21) = 1$, $\operatorname{maj}(213) = 1$, $\operatorname{maj}(4213) = 3$, $\operatorname{maj}(42513) = 4$,
$\operatorname{maj}(425163) = 9$, $\operatorname{maj}(4251673) = 10$, and finally $\operatorname{maj}(w) = 15$. It follows that $f_8(w) =
(0,1,0,2,1,5,1,5)$. Observe that $\sum_{k=1}^{n} t_k = \operatorname{maj}(w^{(n)}) - \operatorname{maj}(w^{(0)}) = \operatorname{maj}(w)$, so the map
$f_n$ is weight-preserving.

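We see from the definition that $f_n(w) = (f_{n-1}(w^{(n-1)}),\, t_n)$, where $t_n = \operatorname{maj}(w) - \operatorname{maj}(w^{(n-1)})$. To
show that $f_n(w)$ does lie in $T_n$, it suffices by induction to show that $t_n \in \mathbf{n} = \{0,1,2,\ldots,n-1\}$.
Let us first consider an example. Suppose we wish to insert the new symbol 8 into the
permutation $w' = 4 > 2, 5 > 1, 6, 7 > 3$, which satisfies $\operatorname{maj}(w') = 1 + 3 + 6 = 10$. There
are eight gaps into which the symbol 8 might be placed. Let us compute the major index
of each of the resulting permutations:


\begin{align*}
\operatorname{maj}(8 > 4 > 2, 5 > 1, 6, 7 > 3) &= 1 + 2 + 4 + 7 = 14 = \operatorname{maj}(w') + 4; \\
\operatorname{maj}(4, 8 > 2, 5 > 1, 6, 7 > 3) &= 2 + 4 + 7 = 13 = \operatorname{maj}(w') + 3; \\
\operatorname{maj}(4 > 2, 8 > 5 > 1, 6, 7 > 3) &= 1 + 3 + 4 + 7 = 15 = \operatorname{maj}(w') + 5; \\
\operatorname{maj}(4 > 2, 5, 8 > 1, 6, 7 > 3) &= 1 + 4 + 7 = 12 = \operatorname{maj}(w') + 2; \\
\operatorname{maj}(4 > 2, 5 > 1, 8 > 6, 7 > 3) &= 1 + 3 + 5 + 7 = 16 = \operatorname{maj}(w') + 6; \\
\operatorname{maj}(4 > 2, 5 > 1, 6, 8 > 7 > 3) &= 1 + 3 + 6 + 7 = 17 = \operatorname{maj}(w') + 7; \\
\operatorname{maj}(4 > 2, 5 > 1, 6, 7, 8 > 3) &= 1 + 3 + 7 = 11 = \operatorname{maj}(w') + 1; \\
\operatorname{maj}(4 > 2, 5 > 1, 6, 7 > 3, 8) &= 1 + 3 + 6 = 10 = \operatorname{maj}(w') + 0.
\end{align*}
Observe that the possible values of $t_n$ are precisely $0,1,2,\ldots,7$ (in some order).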


To see that this always happens, suppose $w^{(n-1)}$ has descents at positions $i_1 > i_2 >
\cdots > i_d$, where $1 \le i_j \le n-2$ for all $j$. There are $n$ gaps between the $n-1$ letters in
$w^{(n-1)}$, including the positions at the far left and far right ends. Let us number these gaps
$0,1,2,\ldots,n-1$ as follows. The gap at the far right end is numbered zero. Next, the gaps
immediately to the right of the descents are numbered $1,2,\ldots,d$ from right to left. (We have
chosen the indexing of the descent positions so that the gap between positions $i_j$ and $i_j + 1$
receives the number $j$.) Then, the remaining gaps are numbered $d+1,\ldots,n-1$ starting at
the far left end and working to the right. In the example considered above, the gaps would
be numbered as follows:


\[ \underset{4}{\_}\ 4 > \underset{3}{\_}\ 2\ \underset{5}{\_}\ 5 > \underset{2}{\_}\ 1\ \underset{6}{\_}\ 6\ \underset{7}{\_}\ 7 > \underset{1}{\_}\ 3\ \underset{0}{\_} \]

Note that inserting the symbol 8 into the gap labeled $j$ causes the major index to increase
by exactly $j$.

Let us prove that this happens in the general case. If we insert $n$ into the far right gap of
$w^{(n-1)}$ (which is labeled zero), there will be no new descents, so $\operatorname{maj}(w^{(n)}) = \operatorname{maj}(w^{(n-1)}) + 0$
as desired. Suppose we insert $n$ into the gap labeled $j$, where $1 \le j \le d$. In $w^{(n-1)}$, we had
$w_{i_j} > w_{i_j+1}$; but the insertion of $n$ changes this configuration to $w_{i_j} < n > w_{i_j+1}$. This
pushes the descent at position $i_j$ one position to the right. Furthermore, the descents that
formerly occurred at positions $i_{j-1},\ldots,i_1$ (which are to the right of $i_j$) also get pushed one
position to the right because of the new symbol $n$. It follows that the major index increases
by exactly $j$, as desired. Finally, suppose $d < j \le n-1$. Let the gap labeled $j$ occur at
position $u$ in the new word, and let $t$ be the number of descents in the old word preceding
this gap. By definition of the gap labeling, we must have $j = (u - t) + d$. On the other
hand, inserting $n$ in this gap produces a new descent at position $u$, and pushes the $(d - t)$
descents located to the right of position $u$ one position further right. The net change to the
major index is therefore $u + (d - t) = j$, as desired.

We now know that $t_k \in \{0,1,2,\ldots,k-1\}$ for all $k$, so that $f_n$ does map $S_n$ into $T_n$. The
preceding discussion also tells us how to invert the action of $f_n$. Given $(t_1,\ldots,t_n) \in T_n$,
we use the $t_k$'s to insert the numbers $1,\ldots,n$ into an initially empty permutation. After
$1,\ldots,k-1$ have been inserted, we number the gaps according to the rules above and then
insert $k$ in the unique gap labeled $t_k$. We just proved that this insertion will increase the
major index by $t_k$. It follows that the resulting permutation $w \in S_n$ is the unique object
satisfying $f_n(w) = (t_1,\ldots,t_n)$. Therefore, $f_n$ is a bijection. $\Box$


6.30. Example. Let us compute $f_6^{-1}(0,1,1,0,4,3)$. Using the insertion algorithm from
the preceding proof, we generate the following sequence of permutations: 1; then 2,1; then
2,3,1; then 2,3,1,4; then 2,3,1,5,4; and finally 6,2,3,1,5,4.
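
The gap-labeling rule from the proof of 6.29 can be turned into code almost verbatim. The sketch below is one possible rendering, with hypothetical names `maj_table`, `gap_labels`, and `from_maj_table`; gap number `g` in the list returned by `gap_labels` means "insert before the letter at 0-based index `g`", with `g = len(v)` denoting the far right end.

```python
from itertools import permutations


def maj(w):
    """Major index: sum of the 1-based positions i with w_i > w_{i+1}."""
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])


def maj_table(w):
    """f_n for the major index: t_k = maj(w^(k)) - maj(w^(k-1))."""
    table, previous = [], 0
    for k in range(1, len(w) + 1):
        current = maj([x for x in w if x <= k])  # w^(k): erase all letters larger than k
        table.append(current - previous)
        previous = current
    return tuple(table)


def gap_labels(v):
    """Label the len(v) + 1 gaps of v following the rules in the proof of 6.29."""
    m = len(v)
    labels = [None] * (m + 1)
    labels[m] = 0                                   # far right gap gets label 0
    descents = [i for i in range(m - 1) if v[i] > v[i + 1]]
    label = 1
    for i in reversed(descents):                    # gaps just right of descents, right to left
        labels[i + 1] = label
        label += 1
    for g in range(m + 1):                          # remaining gaps, left to right
        if labels[g] is None:
            labels[g] = label
            label += 1
    return labels


def from_maj_table(t):
    """Inverse of maj_table: insert k into the unique gap labeled t_k."""
    w = []
    for k, tk in enumerate(t, start=1):
        w.insert(gap_labels(w).index(tk), k)
    return w


print(gap_labels([4, 2, 5, 1, 6, 7, 3]))         # the labeled gaps from the proof
print(maj_table([4, 2, 8, 5, 1, 6, 7, 3]))
print(from_maj_table((0, 1, 1, 0, 4, 3)))        # Example 6.30
```

Recomputing all gap labels at every stage keeps the code close to the proof; it is not meant to be efficient.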


6.7 Quantum Binomial Coefficients 


The formula $[n]!_x = \prod_{i=1}^{n} [i]_x$ for the quantum factorial is analogous to the formula $n! =
\prod_{i=1}^{n} i$ for the ordinary factorial. We can extend this analogy to binomial coefficients and
multinomial coefficients. This leads to the following definitions.


6.31. Definition: Quantum Binomial Coefficients. Suppose $n \ge 0$, $0 \le k \le n$, and $x$
is any variable. We define the quantum binomial coefficients by the formula


\[ \begin{bmatrix} n \\ k \end{bmatrix}_x = \frac{[n]!_x}{[k]!_x\,[n-k]!_x} = \frac{(x^n - 1)(x^{n-1} - 1) \cdots (x - 1)}{(x^k - 1)(x^{k-1} - 1) \cdots (x - 1)\,(x^{n-k} - 1)(x^{n-k-1} - 1) \cdots (x - 1)}. \]

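6.32. Definition: Quantum Multinomial Coefficients. Suppose $n_1,\ldots,n_k \ge 0$ and $x$
is any variable. We define the quantum multinomial coefficients by the formula

\[ \begin{bmatrix} n_1 + \cdots + n_k \\ n_1,\ldots,n_k \end{bmatrix}_x = \frac{[n_1 + \cdots + n_k]!_x}{[n_1]!_x\,[n_2]!_x \cdots [n_k]!_x} = \frac{(x^{n_1 + \cdots + n_k} - 1)(x^{n_1 + \cdots + n_k - 1} - 1) \cdots (x - 1)}{\prod_{i=1}^{k} \bigl[(x^{n_i} - 1)(x^{n_i - 1} - 1) \cdots (x - 1)\bigr]}. \]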


One cannot immediately see from the defining formulas that the quantum binomial and 
multinomial coefficients are actually polynomials in x (as opposed to quotients of polyno- 
mials). However, we will soon prove that these entities are polynomials with nonnegative 
integer coefficients. We will also give several combinatorial interpretations for quantum bi- 
nomial and multinomial coefficients in terms of suitable weighted sets of objects. Before 
doing so, we need to develop a few more tools. 


It is immediate from the definitions that

\[ \begin{bmatrix} n \\ k \end{bmatrix}_x = \begin{bmatrix} n \\ n-k \end{bmatrix}_x = \begin{bmatrix} n \\ k,\, n-k \end{bmatrix}_x; \]

in particular, quantum binomial coefficients are special cases of quantum multinomial coefficients.
We will usually prefer to use multinomial coefficients, writing $\begin{bmatrix} a+b \\ a,b \end{bmatrix}_x$ in preference to
$\begin{bmatrix} a+b \\ a \end{bmatrix}_x$ or $\begin{bmatrix} a+b \\ b \end{bmatrix}_x$,
because in most combinatorial applications the parameters $a$ and $b$ are more natural than
$a$ and $a+b$.

Before entering into further combinatorial discussions, we pause to give an algebraic
proof of two fundamental recursions satisfied by the quantum binomial coefficients. These
recursions are both "quantum analogues" of the binomial coefficient recursion 2.25.


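6.33. Theorem: Recursions for Quantum Binomial Coefficients. For all $a,b > 0$,
we have the recursion

\[ \begin{bmatrix} a+b \\ a,b \end{bmatrix}_x
 = x^b \begin{bmatrix} a+b-1 \\ a-1,b \end{bmatrix}_x + \begin{bmatrix} a+b-1 \\ a,b-1 \end{bmatrix}_x
 = \begin{bmatrix} a+b-1 \\ a-1,b \end{bmatrix}_x + x^a \begin{bmatrix} a+b-1 \\ a,b-1 \end{bmatrix}_x. \]

The initial conditions are $\begin{bmatrix} a \\ a,0 \end{bmatrix}_x = \begin{bmatrix} b \\ 0,b \end{bmatrix}_x = 1$.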
Proof. We prove the first equality, leaving the second one as an exercise. Writing out the
definitions, the right side of the first recursion is

\[ x^b \begin{bmatrix} a+b-1 \\ a-1,b \end{bmatrix}_x + \begin{bmatrix} a+b-1 \\ a,b-1 \end{bmatrix}_x
 = \frac{x^b\,[a+b-1]!_x}{[a-1]!_x\,[b]!_x} + \frac{[a+b-1]!_x}{[a]!_x\,[b-1]!_x}. \]

Multiply the first fraction by $[a]_x/[a]_x$ and the second fraction by $[b]_x/[b]_x$ to create a
common denominator. Bringing out common factors, we obtain

\[ \left( \frac{[a+b-1]!_x}{[a]!_x\,[b]!_x} \right) \cdot \left( x^b [a]_x + [b]_x \right). \]

By definition of quantum integers,

\[ [b]_x + x^b [a]_x = (1 + x + x^2 + \cdots + x^{b-1}) + x^b(1 + x + \cdots + x^{a-1})
 = (1 + \cdots + x^{b-1}) + (x^b + \cdots + x^{a+b-1}) = [a+b]_x. \]

Putting this into the previous formula, we get

\[ \frac{[a+b-1]!_x\,[a+b]_x}{[a]!_x\,[b]!_x} = \begin{bmatrix} a+b \\ a,b \end{bmatrix}_x. \]

The initial conditions follow immediately from the definitions. □


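6.34. Corollary: Polynomiality of Quantum Binomial Coefficients. For all $n \ge 0$
and $0 \le k \le n$, $\begin{bmatrix} n \\ k \end{bmatrix}_x$ is a polynomial in $x$ with nonnegative integer coefficients.


Proof. Use induction on $n \ge 0$, the base case $n = 0$ being evident. Assume $n > 0$ and that
$\begin{bmatrix} n-1 \\ j \end{bmatrix}_x$ is already known to be a polynomial with coefficients in $\mathbb{N}$ for $0 \le j \le n-1$.
If $k = 0$ or $k = n$, the coefficient equals 1 by the initial conditions. Otherwise,
by the recursion derived above (taking $n = a+b$ and $k = a$, so $b = n-k$),

\[ \begin{bmatrix} n \\ k \end{bmatrix}_x = x^{n-k} \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}_x + \begin{bmatrix} n-1 \\ k \end{bmatrix}_x. \]

Thanks to the induction hypothesis, we know that the right side is a polynomial with
coefficients in $\mathbb{N}$. This completes the induction argument. □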
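The induction in this proof doubles as an algorithm. The following sketch computes the coefficient list of a quantum binomial coefficient from the first recursion of 6.33. The function name `qbinom` and the tuple representation (constant term first) are choices made here for illustration, not notation from the text.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def qbinom(a, b):
    """Coefficients (constant term first) of the quantum binomial [a+b; a,b]_x."""
    if a == 0 or b == 0:
        return (1,)                           # initial conditions of 6.33
    shifted = (0,) * b + qbinom(a - 1, b)     # x^b * [a+b-1; a-1,b]_x
    other = qbinom(a, b - 1)                  # [a+b-1; a,b-1]_x
    size = max(len(shifted), len(other))
    pad = lambda p: p + (0,) * (size - len(p))
    return tuple(s + t for s, t in zip(pad(shifted), pad(other)))

print(qbinom(2, 2))   # coefficients of 1 + x + 2x^2 + x^3 + x^4
```

Only additions of nonnegative integers occur, which mirrors why the coefficients stay in $\mathbb{N}$. Setting $x = 1$ (summing the coefficients) recovers the ordinary binomial coefficient.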


We need one more result before describing the combinatorial interpretations for the 
quantum binomial coefficients. This result can be viewed as a generalization of the weighted 
bijection rule. 


6.35. Theorem: Weight-Shifting Rule. Suppose $(S, w_1)$ and $(T, w_2)$ are finite weighted
sets, and $f : S \to T$ is a bijection such that, for some constant $b$, $w_1(z) = w_2(f(z)) + b$
holds for all $z \in S$. Then

\[ G_{S,w_1}(x) = x^b\, G_{T,w_2}(x). \]


Proof. Let $\{*\}$ be a one-point set with $\mathrm{wt}(*) = b$. The generating function for this weighted
set is $x^b$. There is a bijection $i : T \to T \times \{*\}$ given by $t \mapsto (t,*)$ for $t \in T$. Observe that
$i \circ f : S \to T \times \{*\}$ is a weight-preserving bijection, since

\[ w_1(z) = w_2(f(z)) + b = w_2(f(z)) + \mathrm{wt}(*) = \mathrm{wt}((f(z),*)) = \mathrm{wt}(i(f(z))) \qquad (z \in S). \]

Therefore, by the bijection rule and product rule for weighted sets,

\[ G_S(x) = G_{T \times \{*\}}(x) = G_T(x)\, G_{\{*\}}(x) = x^b\, G_T(x). \quad □ \]
6.36. Theorem: Combinatorial Interpretations of Quantum Binomial Coefficients.
Fix integers $a,b \ge 0$. Let $L(a,b)$ be the set of all lattice paths from $(0,0)$ to
$(a,b)$. Let $P(a,b)$ be the set of integer partitions $\mu$ with largest part at most $a$ and with at
most $b$ parts. Then

\[ \begin{bmatrix} a+b \\ a,b \end{bmatrix}_x
 = \sum_{w \in R(0^a1^b)} x^{\mathrm{inv}(w)}
 = \sum_{\pi \in L(a,b)} x^{\mathrm{area}(\pi)}
 = \sum_{\pi \in L(a,b)} x^{\mathrm{area}'(\pi)}
 = \sum_{\mu \in P(a,b)} x^{|\mu|}. \]



Proof. In 6.19, we constructed weight-preserving bijections between the three weighted sets
$(R(0^a1^b), \mathrm{inv})$, $(L(a,b), \mathrm{area})$, and $(L(a,b), \mathrm{area}')$. Furthermore, Figure 2.18 shows that the
weighted set $(P(a,b), |\cdot|)$ is essentially identical to the weighted set $(L(a,b), \mathrm{area}')$. So all of
the combinatorial summations in the theorem are equal by the weighted bijection rule. We
must prove that these all equal $\begin{bmatrix} a+b \\ a,b \end{bmatrix}_x$. We give two proofs illustrating different techniques.


First Proof. For each $a,b \ge 0$, let $g(a,b) = \sum_{\pi \in L(a,b)} x^{\mathrm{area}(\pi)} = G_{L(a,b),\mathrm{area}}(x)$. Suppose
we can show that this function satisfies the same recursion and initial conditions as the
quantum binomial coefficients do, namely

\[ g(a,b) = x^b g(a-1,b) + g(a,b-1); \qquad g(a,0) = g(0,b) = 1. \]

Then a routine induction on $a+b$ will prove that $\begin{bmatrix} a+b \\ a,b \end{bmatrix}_x = g(a,b)$ for all $a,b \ge 0$.

To check the initial conditions, note that there is only one lattice path from $(0,0)$ to
$(a,0)$, and the area underneath this path is zero. So $g(a,0) = x^0 = 1$. Similarly, $g(0,b) = 1$.
Now let us prove the recursion for $g(a,b)$, assuming $a,b > 0$. The set $L(a,b)$ is the disjoint
union of sets $L_1$ and $L_2$, where $L_1$ consists of all paths from $(0,0)$ to $(a,b)$ ending in an east
step, and $L_2$ consists of all paths from $(0,0)$ to $(a,b)$ ending in a north step. See Figure 6.2.
Deleting the final north step from a path in $L_2$ defines a bijection from $L_2$ to $L(a,b-1)$, which
is weight-preserving since the area below the path is not affected by the deletion of the north
step. It follows that $G_{L_2,\mathrm{area}}(x) = G_{L(a,b-1),\mathrm{area}}(x) = g(a,b-1)$. On the other hand, deleting
the final east step from a path in $L_1$ defines a bijection from $L_1$ to $L(a-1,b)$ that is not
weight-preserving. The reason is that the $b$ area cells below the final east step in a path in $L_1$
no longer contribute to the area of the path in $L(a-1,b)$. However, since the area drops by $b$
for all objects in $L_1$, we can conclude from the weight-shifting rule that
$G_{L_1,\mathrm{area}}(x) = x^b G_{L(a-1,b),\mathrm{area}}(x) = x^b g(a-1,b)$.
Now, by the sum rule for weighted sets,

\[ g(a,b) = G_{L(a,b),\mathrm{area}}(x) = G_{L_1}(x) + G_{L_2}(x) = x^b g(a-1,b) + g(a,b-1). \]

We remark that a similar argument involving deletion of the initial step of a path in $L(a,b)$
establishes the dual recursion $g(a,b) = g(a-1,b) + x^a g(a,b-1)$.
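The first proof can be checked by brute force for small rectangles. The sketch below enumerates all lattice paths to $(a,b)$, tabulates the area below each path, and compares the result with $x^b g(a-1,b) + g(a,b-1)$. The encoding of a path by the set of positions of its east steps, and the names `area_poly` and `recursion_rhs`, are assumptions of this illustration.

```python
from itertools import combinations

def area_poly(a, b):
    """Coefficient list of g(a,b): entry k counts paths to (a,b) with area k."""
    coeffs = [0] * (a * b + 1)
    for east_steps in combinations(range(a + b), a):
        east = set(east_steps)
        height = area = 0
        for step in range(a + b):
            if step in east:
                area += height    # unit cells directly below this east step
            else:
                height += 1       # north step
        coeffs[area] += 1
    return coeffs

def recursion_rhs(a, b):
    """x^b * g(a-1,b) + g(a,b-1) as a coefficient list of length a*b + 1."""
    rhs = [0] * (a * b + 1)
    for k, c in enumerate(area_poly(a - 1, b)):
        rhs[k + b] += c           # paths ending in an east step lose b cells
    for k, c in enumerate(area_poly(a, b - 1)):
        rhs[k] += c               # paths ending in a north step keep their area
    return rhs

print(area_poly(2, 2), recursion_rhs(2, 2))
```

Both lists agree for every small case tried, as the sum rule and weight-shifting rule predict.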
Second Proof. Here we will prove that

\[ [a+b]!_x = [a]!_x\,[b]!_x \sum_{w \in R(0^a1^b)} x^{\mathrm{inv}(w)}, \]

which is equivalent to the first equality in the theorem statement. We know from 6.26 that
the left side here is the generating function for the set $S_{a+b}$ of permutations of $\{1,2,\ldots,a+b\}$,
weighted by inversions. By the product rule, the right side is the generating function for
the product set $S_a \times S_b \times R(0^a1^b)$, with the usual product weight
($\mathrm{wt}(u,v,w) = \mathrm{inv}(u) + \mathrm{inv}(v) + \mathrm{inv}(w)$). Therefore, it suffices to define a bijection

\[ f : S_a \times S_b \times R(0^a1^b) \to S_{a+b} \]

such that $\mathrm{inv}(f(u,v,w)) = \mathrm{inv}(u) + \mathrm{inv}(v) + \mathrm{inv}(w)$ for $u \in S_a$, $v \in S_b$, and $w \in R(0^a1^b)$.
FIGURE 6.2
Deleting the final step of a lattice path.
Given $(u,v,w)$ in the domain of $f$, note that $u$ is a permutation of the $a$ letters $1,2,\ldots,a$.
Replace the $a$ zeroes in $w$ with these $a$ letters, in the same order that they occur in $u$. Next,
add $a$ to each of the letters in the permutation $v$, and then replace the $b$ ones in $w$ by these
new letters in the same order that they occur in $v$. The resulting object $z$ is evidently a
permutation of $\{1,2,\ldots,a+b\}$. For example, if $a = 3$ and $b = 5$, then

\[ f(132,\ 24531,\ 01100111) = 15732864. \]

Since $a$ and $b$ are fixed and known, we can reverse the action of $f$. Starting with a permutation
$z$ of $a+b$ elements, we first recover the word $w \in R(0^a1^b)$ by replacing the numbers
$1,2,\ldots,a$ in $z$ by zeroes and replacing the numbers $a+1,\ldots,a+b$ in $z$ by ones. Next, we
take the subword of $z$ consisting of the numbers $1,2,\ldots,a$ to recover $u$. Similarly, let $v'$ be
the subword of $z$ consisting of the numbers $a+1,\ldots,a+b$. We recover $v$ by subtracting
$a$ from each of these numbers. This algorithm defines a two-sided inverse map to $f$. For
example, still taking $a = 3$ and $b = 5$, we have

\[ f^{-1}(35162847) = (312,\ 23514,\ 01010111). \]

All that remains is to check that $f$ is weight-preserving. Fix $u,v,w,z$ with $z = f(u,v,w)$. Let
$A$ be the set of positions in $z$ occupied by letters in $u$, and let $B$ be the remaining positions
(occupied by shifted letters of $v$). Equivalently, by definition of $f$, $A = \{i : w_i = 0\}$ and
$B = \{i : w_i = 1\}$. The inversions of $z$ can be classified into three kinds. First, there are
inversions $(i,j)$ such that $i,j \in A$. These inversions correspond bijectively to the inversions
of $u$. Second, there are inversions $(i,j)$ such that $i,j \in B$. These inversions correspond
bijectively to the inversions of $v$. Third, there are inversions $(i,j)$ such that $i \in A$ and
$j \in B$, or $i \in B$ and $j \in A$. The first case ($i \in A$, $j \in B$) cannot occur, because every
position in $A$ is filled with a lower number than every position in $B$. The second case
($i \in B$, $j \in A$) occurs iff $i < j$ and $w_i = 1$ and $w_j = 0$. This means that the inversions of the
third kind in $z$ correspond bijectively to the inversions of the binary word $w$. Conclusion:
$\mathrm{inv}(z) = \mathrm{inv}(u) + \mathrm{inv}(v) + \mathrm{inv}(w)$, as desired. □
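The bijection $f$ of the second proof and its inverse are short enough to write out. In this sketch, permutations and words are Python lists, and the names `merge` and `unmerge` are illustrative labels for $f$ and $f^{-1}$, not names from the text.

```python
def inv(w):
    """Number of pairs i < j with w[i] > w[j]."""
    return sum(1 for i in range(len(w))
                 for j in range(i + 1, len(w)) if w[i] > w[j])

def merge(u, v, w):
    """f(u, v, w): fill the zeroes of w with u and the ones of w with v shifted by a."""
    a = len(u)
    u_letters = iter(u)
    v_letters = iter(x + a for x in v)
    return [next(u_letters) if bit == 0 else next(v_letters) for bit in w]

def unmerge(z, a):
    """Inverse map: recover (u, v, w) from z, given the fixed value a."""
    w = [0 if x <= a else 1 for x in z]
    u = [x for x in z if x <= a]
    v = [x - a for x in z if x > a]
    return u, v, w

u, v, w = [1, 3, 2], [2, 4, 5, 3, 1], [0, 1, 1, 0, 0, 1, 1, 1]
z = merge(u, v, w)
print(z, inv(z), inv(u) + inv(v) + inv(w))
```

Running this on the worked examples reproduces $15732864$ and $(312, 23514, 01010111)$, and the inversion counts on both sides agree.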


Like ordinary binomial coefficients, the quantum binomial coefficients appear in a
plethora of identities, many of which have combinatorial proofs. Here is a typical example,
which is a weighted analogue of the Chu-Vandermonde identity in 2.21.


6.37. Theorem: Quantum Chu-Vandermonde Identity. For all integers $a,b,c \ge 0$,

\[ \begin{bmatrix} a+b+c+1 \\ a,\ b+c+1 \end{bmatrix}_x
 = \sum_{k=0}^{a} x^{(b+1)(a-k)} \begin{bmatrix} k+b \\ k,b \end{bmatrix}_x \begin{bmatrix} a-k+c \\ a-k,c \end{bmatrix}_x. \]


Proof. Recall the picture we used to prove the original version of the identity (Figure 2.3),
which is reprinted here as Figure 6.3. The path dissection in this picture defines a bijection

\[ f : L(a,b+c+1) \to \bigcup_{k=0}^{a} L(k,b) \times L(a-k,c). \]

Here, $k$ is the $x$-coordinate where the given path in $L(a,b+c+1)$ crosses the line $y = b+(1/2)$.
The bijection $f$ is not weight-preserving. However, if a path $P \in L(a,b+c+1)$ maps to
$(Q,R) \in L(k,b) \times L(a-k,c)$ under $f$, then it is evident from the picture that

\[ \mathrm{area}(P) = \mathrm{area}(Q) + \mathrm{area}(R) + (b+1)(a-k). \]

(The extra term comes from the lower-right rectangle of width $a-k$ and height $b+1$.) It
now follows from the weight-shifting rule, the sum rule, and the product rule that

\[ G_{L(a,b+c+1),\mathrm{area}}(x) = \sum_{k=0}^{a} x^{(b+1)(a-k)}\, G_{L(k,b),\mathrm{area}}(x) \cdot G_{L(a-k,c),\mathrm{area}}(x). \]

We complete the proof by using 6.36 to replace each area generating function here by a
suitable quantum binomial coefficient. □
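As a sanity check, the identity can be verified numerically for small parameters. The sketch below builds quantum binomial coefficients from the recursion of 6.33 and multiplies coefficient lists directly. The helper names `qbinom`, `poly_mul`, and `chu_vandermonde_rhs` are illustrative assumptions.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def qbinom(a, b):
    """Coefficient tuple of [a+b; a,b]_x, built from the recursion in 6.33."""
    if a == 0 or b == 0:
        return (1,)
    left = (0,) * b + qbinom(a - 1, b)                 # degree a*b
    right = qbinom(a, b - 1)                           # degree a*(b-1)
    right = right + (0,) * (len(left) - len(right))
    return tuple(s + t for s, t in zip(left, right))

def poly_mul(p, q):
    """Product of two coefficient sequences."""
    out = [0] * (len(p) + len(q) - 1)
    for i, s in enumerate(p):
        for j, t in enumerate(q):
            out[i + j] += s * t
    return out

def chu_vandermonde_rhs(a, b, c):
    """Right side of 6.37 as a coefficient tuple."""
    total = [0] * (a * (b + c + 1) + 1)
    for k in range(a + 1):
        term = poly_mul(qbinom(k, b), qbinom(a - k, c))
        shift = (b + 1) * (a - k)                      # area of the lower-right rectangle
        for e, coeff in enumerate(term):
            total[e + shift] += coeff
    return tuple(total)

print(qbinom(2, 3) == chu_vandermonde_rhs(2, 1, 1))
```

Here `qbinom(a, b + c + 1)` plays the role of the left side of the identity.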


6.38. Remark. There are also linear-algebraic interpretations of the quantum binomial
coefficients. Specifically, let $F$ be a finite field with $q$ elements, where $q$ is necessarily a
prime power. Then the integer $\begin{bmatrix} n \\ k \end{bmatrix}_q$ (which is $\begin{bmatrix} n \\ k \end{bmatrix}_x$ evaluated at $x = q$) is the number of
$k$-dimensional subspaces of the $n$-dimensional vector space $F^n$. We prove this fact in §12.7.


6.8 Quantum Multinomial Coefficients 


Recall from 2.27 that the ordinary multinomial coefficients $C(n; n_1,\ldots,n_s)$ (where
$n = \sum_{k} n_k$) satisfy the recursion

\[ C(n; n_1,\ldots,n_s) = \sum_{k=1}^{s} C(n-1; n_1,\ldots,n_k - 1,\ldots,n_s). \]

The quantum multinomial coefficients satisfy the following analogous recursion.
FIGURE 6.3
Picture used to prove the q-Chu-Vandermonde identity.

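6.39. Theorem: Recursions for Quantum Multinomial Coefficients. Let $n_1,\ldots,n_s$
be nonnegative integers, and set $n = \sum_{k=1}^{s} n_k$. Then

\[ \begin{bmatrix} n \\ n_1,\ldots,n_s \end{bmatrix}_x
 = \sum_{k=1}^{s} x^{n_1+n_2+\cdots+n_{k-1}} \begin{bmatrix} n-1 \\ n_1,\ldots,n_k - 1,\ldots,n_s \end{bmatrix}_x. \]

(If $n_k = 0$, the $k$th summand on the right side is zero.) The initial condition is
$\begin{bmatrix} 0 \\ 0,\ldots,0 \end{bmatrix}_x = 1$.
Moreover, $\begin{bmatrix} n \\ n_1,\ldots,n_s \end{bmatrix}_x$ is a polynomial in $x$ with coefficients in $\mathbb{N}$.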

Proof. Neither side of the claimed recursion changes if we drop all $n_i$'s that are equal to zero;
so, without loss of generality, assume every $n_i$ is positive. We can create a common factor
of $[n-1]!_x / \prod_{j=1}^{s} [n_j]!_x$ on the right side by multiplying the $k$th summand by $[n_k]_x/[n_k]_x$,
for $1 \le k \le s$. Pulling out this common factor, we are left with

\[ \sum_{k=1}^{s} x^{n_1+\cdots+n_{k-1}} [n_k]_x = \sum_{k=1}^{s} x^{n_1+\cdots+n_{k-1}} \left(1 + x + x^2 + \cdots + x^{n_k-1}\right). \]

The $k$th summand consists of the sum of consecutive powers of $x$ starting at $x^{n_1+\cdots+n_{k-1}}$
and ending at $x^{n_1+\cdots+n_k-1}$. Chaining these together, we see that the sum evaluates to
$x^0 + x^1 + \cdots + x^{n-1} = [n]_x$. Multiplying by the common factor mentioned above, we obtain

\[ \frac{[n-1]!_x\,[n]_x}{\prod_{j=1}^{s} [n_j]!_x} = \begin{bmatrix} n \\ n_1,\ldots,n_s \end{bmatrix}_x, \]

as desired. The initial condition is immediate. Finally, we deduce polynomiality of the
quantum multinomial coefficients using induction on $n$ and the recursion just proved, as in
the proof of 6.34. □


6.40. Theorem: Quantum Multinomial Coefficients and Inversions of Words.
Suppose $A = \{a_1 < a_2 < \cdots < a_s\}$ is a totally ordered alphabet. For all integers
$n_1,\ldots,n_s \ge 0$,

\[ \begin{bmatrix} n_1+\cdots+n_s \\ n_1,\ldots,n_s \end{bmatrix}_x = \sum_{w \in R(a_1^{n_1}\cdots a_s^{n_s})} x^{\mathrm{inv}(w)}. \]

Proof. For all integers $n_1,\ldots,n_s$, define

\[ g(n_1,\ldots,n_s) = \sum_{w \in R(a_1^{n_1}\cdots a_s^{n_s})} x^{\mathrm{inv}(w)}. \]

(This is zero by convention if any $n_i$ is negative.) By induction on $\sum_k n_k$, it suffices to show
that $g$ satisfies the recursion in 6.39. Now $g(0,0,\ldots,0) = x^0 = 1$, so the initial condition is
correct. Next, fix $n_1,\ldots,n_s \ge 0$, and let $W$ be the set of words appearing in the definition
of $g(n_1,\ldots,n_s)$. Write $W$ as the disjoint union of sets $W_1,\ldots,W_s$, where $W_k$ consists of
the words in $W$ with first letter $a_k$. By the sum rule,

\[ g(n_1,\ldots,n_s) = G_W(x) = \sum_{k=1}^{s} G_{W_k}(x). \]

Fix a value of $k$ in the range $1 \le k \le s$ such that $W_k$ is nonempty. Erasing the first letter
of a word $w$ in $W_k$ defines a bijection from $W_k$ to the set $R(a_1^{n_1}\cdots a_k^{n_k-1}\cdots a_s^{n_s})$. The
generating function for the latter set is $g(n_1,\ldots,n_k-1,\ldots,n_s)$. The bijection in question
does not preserve weights, because inversions involving the first letter of $w \in W_k$ disappear
when this letter is erased. However, no matter what word $w$ we pick in $W_k$, the number of
inversions that involve the first letter in $w$ will always be the same. Specifically, this first
letter (namely $a_k$) will cause inversions with all of the $a_1$'s, $a_2$'s, ..., and $a_{k-1}$'s that follow
it in $w$. The number of such letters is $n_1+\cdots+n_{k-1}$. Therefore, by the weight-shifting rule,

\[ G_{W_k,\mathrm{inv}}(x) = x^{n_1+\cdots+n_{k-1}}\, g(n_1,\ldots,n_k-1,\ldots,n_s). \]

This equation is also correct if $W_k = \emptyset$ (which occurs iff $n_k = 0$). Using these results in the
formula above, we conclude that

\[ g(n_1,\ldots,n_s) = \sum_{k=1}^{s} x^{n_1+\cdots+n_{k-1}}\, g(n_1,\ldots,n_k-1,\ldots,n_s), \]

which is precisely the recursion occurring in 6.39. □
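Theorem 6.40 lends itself to a direct experiment: compute the quantum multinomial coefficient from the recursion in 6.39, then compare it with the inversion distribution obtained by listing every rearrangement of a small multiset. The names `qmultinomial` and `inv_distribution`, and the use of the alphabet $1 < 2 < \cdots < s$, are assumptions of this sketch.

```python
from collections import Counter
from functools import lru_cache
from itertools import permutations

@lru_cache(maxsize=None)
def qmultinomial(ns):
    """Coefficient tuple of [n; n_1,...,n_s]_x via the recursion in 6.39."""
    if sum(ns) == 0:
        return (1,)                               # initial condition
    total = Counter()
    shift = 0                                     # n_1 + ... + n_{k-1}
    for k, nk in enumerate(ns):
        if nk > 0:                                # kth summand is zero if n_k = 0
            smaller = ns[:k] + (nk - 1,) + ns[k + 1:]
            for e, c in enumerate(qmultinomial(smaller)):
                total[shift + e] += c
        shift += nk
    return tuple(total[e] for e in range(max(total) + 1))

def inv(w):
    return sum(1 for i in range(len(w))
                 for j in range(i + 1, len(w)) if w[i] > w[j])

def inv_distribution(ns):
    """Coefficient tuple of the sum of x^inv(w) over all words with n_i copies of letter i."""
    word = [letter for letter, n in enumerate(ns, start=1) for _ in range(n)]
    dist = Counter(inv(w) for w in set(permutations(word)))
    return tuple(dist[e] for e in range(max(dist) + 1))

print(qmultinomial((2, 1, 2)), inv_distribution((2, 1, 2)))
```

The brute-force side grows factorially, so it is only practical for words of length up to about eight.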


6.41. Remark. This theorem can also be proved by generalizing the second proof of 6.36.
Specifically, one can prove that

\[ [n_1+\cdots+n_s]!_x = [n_1]!_x \cdots [n_s]!_x \sum_{w \in R(1^{n_1}\cdots s^{n_s})} x^{\mathrm{inv}(w)} \]

by defining a weight-preserving bijection

\[ f : S_{n_1+\cdots+n_s} \to S_{n_1} \times \cdots \times S_{n_s} \times R(1^{n_1}\cdots s^{n_s}) \]

(where $S_{n_i}$ is the set of all permutations of $\{1,2,\ldots,n_i\}$, and all sets in the Cartesian
product are weighted by inversions). We leave the details as an exercise.
6.9 Foata’s Map


We know from 6.26 and 6.29 that $\sum_{w \in S_n} x^{\mathrm{inv}(w)} = [n]!_x = \sum_{w \in S_n} x^{\mathrm{maj}(w)}$, where $S_n$ is the
set of permutations of $\{1,2,\ldots,n\}$. We can express this result by saying that the statistics
inv and maj are equidistributed on $S_n$. We have just derived a formula for the distribution
of inv on more general sets of words, namely

\[ \sum_{w \in R(1^{n_1}\cdots s^{n_s})} x^{\mathrm{inv}(w)} = \begin{bmatrix} n_1+\cdots+n_s \\ n_1,\ldots,n_s \end{bmatrix}_x. \]

Could it be true that inv and maj are still equidistributed on these more general sets?
MacMahon [90] proved that this is indeed the case. We present a combinatorial proof of
this result based on a bijection due to Dominique Foata. For each set $S = R(1^{n_1}\cdots s^{n_s})$,
our goal is to define a weight-preserving bijection $f : (S,\mathrm{maj}) \to (S,\mathrm{inv})$.

To achieve our goal, let $W$ be the set of all words in the alphabet $\{1,2,\ldots,s\}$. We shall
define a function $g : W \to W$ with the following properties: (a) $g$ is a bijection; (b) for
all $w \in W$, $w$ and $g(w)$ are anagrams (§1.9); (c) if $w$ is not the empty word, then $w$ and
$g(w)$ have the same last letter; (d) for all $w \in W$, $\mathrm{inv}(g(w)) = \mathrm{maj}(w)$. We can then obtain
the desired weight-preserving bijections $f$ by restricting $g$ to the various anagram classes
$R(1^{n_1}\cdots s^{n_s})$.

We will define $g$ by recursion on the length of $w \in W$. If this length is 0 or 1, set
$g(w) = w$. Then conditions (b), (c), and (d) hold in this case. Now suppose $w$ has length
$n \ge 2$. Write $w = w'yz$, where $w' \in W$ and $y, z$ are the last two letters of $w$. We can assume
by induction that $u = g(w'y)$ has already been defined, and that $u$ is an anagram of $w'y$
ending in $y$ such that $\mathrm{inv}(u) = \mathrm{maj}(w'y)$. We will define $g(w) = h_z(u)z$, where $h_z : W \to W$
is a certain map (to be described momentarily) that satisfies conditions (a) and (b) above.
No matter what the details of the definition of $h_z$, it is already evident that $g$ will satisfy
conditions (b) and (c) for words of length $n$.

To motivate the definition of $h_z$, we first give a lemma that analyzes the effect on inv
and maj of appending a letter to the end of a word. The lemma will use the following
convenient notation. If $u$ is any word and $z$ is any letter, let $n_{\le z}(u)$ be the number of letters
in $u$ (counting repetitions) that are $\le z$; define $n_{<z}(u)$, $n_{>z}(u)$, and $n_{\ge z}(u)$ similarly.


6.42. Lemma. Suppose $u$ is a word of length $m$ with last letter $y$, and $z$ is any letter.
(a) If $y \le z$, then $\mathrm{maj}(uz) = \mathrm{maj}(u)$. (b) If $y > z$, then $\mathrm{maj}(uz) = \mathrm{maj}(u) + m$.
(c) $\mathrm{inv}(uz) = \mathrm{inv}(u) + n_{>z}(u)$.


Proof. All statements follow routinely from the definitions of inv and maj. □


Let us now describe the map $h_z : W \to W$. First, $h_z$ sends the empty word to itself.
Now suppose $u$ is a nonempty word ending in $y$. There are two cases. Case 1: $y \le z$. In this
case, we break the word $u$ into runs of consecutive letters such that the last letter in each
run is $\le z$, while all preceding letters in the run are $> z$. For example, if $u = 1342434453552$
and $z = 3$, then the decomposition of $u$ into runs is

\[ u = 1/3/4,2/4,3/4,4,5,3/5,5,2/ \]

where we use slashes to delimit consecutive runs. Now, $h_z$ operates on $u$ by cyclically shifting
the letters in each run one step to the right. Continuing the preceding example,

\[ h_3(u) = 1/3/2,4/3,4/3,4,4,5/2,5,5/. \]
What effect does this process have on $\mathrm{inv}(u)$? The last element in each run (which is $\le z$) is
strictly less than all elements before it in its run (which are $> z$). So, moving the last element
to the front of its run causes the inversion number to drop by the number of elements $> z$
in the run. Adding up these changes over all the runs, we see that

\[ \mathrm{inv}(h_z(u)) = \mathrm{inv}(u) - n_{>z}(u) \tag{6.1} \]

in case 1. Furthermore, note that the first letter of $h_z(u)$ is always $\le z$ in this case.

Case 2: $y > z$. Again we break the word $u$ into runs, but here the last letter of each
run must be $> z$, while all preceding letters in the run are $\le z$. For example, if $z = 3$ and
$u = 134243445355$, we decompose $u$ as

\[ u = 1,3,4/2,4/3,4/4/5/3,5/5/. \]

As before, we cyclically shift the letters in each run one step right, which gives

\[ h_3(u) = 4,1,3/4,2/4,3/4/5/5,3/5/ \]

in our example. This time, the last element in each run is $> z$ and is strictly greater than
the elements $\le z$ that precede it in its run. So, the cyclic shift of each run will increase the
inversion count by the number of elements $\le z$ in the run. Adding over all runs, we see that

\[ \mathrm{inv}(h_z(u)) = \mathrm{inv}(u) + n_{\le z}(u) \tag{6.2} \]

in case 2. Furthermore, note that the first letter of $h_z(u)$ is always $> z$ in this case.

In both cases, $h_z(u)$ is an anagram of $u$. Moreover, we can invert the action of $h_z$ as
follows. Examination of the first letter of $h_z(u)$ tells us whether we were in case 1 or case
2 above. To invert in case 1, break the word into runs whose first letter is $\le z$ and whose
other letters are $> z$, and cyclically shift each run one step left. To invert in case 2, break
the word into runs whose first letter is $> z$ and whose other letters are $\le z$, and cyclically
shift each run one step left. We now see that $h_z$ is a bijection. For example, to compute
$h_3^{-1}(1342434453552)$, first write

\[ 1/3,4/2,4/3,4,4,5/3,5,5/2/ \]

and then cyclically shift to get the answer $1/4,3/4,2/4,4,5,3/5,5,3/2/$.
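The run-shifting map $h_z$ and its inverse translate into a few lines of Python. In this sketch words are lists of integers, and `h` and `h_inv` are illustrative names. Both cases are handled together by observing that every run closes on a letter of the same type ($\le z$ or $> z$) as the last letter of the word.

```python
def h(z, u):
    """h_z(u): cyclically shift each run of u one step to the right."""
    if not u:
        return []
    last_small = u[-1] <= z             # True in case 1 (y <= z), False in case 2
    out, run = [], []
    for letter in u:
        run.append(letter)
        if (letter <= z) == last_small:     # this letter closes the current run
            out.extend([run[-1]] + run[:-1])
            run = []
    return out

def h_inv(z, v):
    """Inverse of h_z: the first letter of v reveals the case; shift runs left."""
    if not v:
        return []
    first_small = v[0] <= z
    out, run = [], []
    for letter in v:
        if run and (letter <= z) == first_small:   # this letter opens a new run
            out.extend(run[1:] + [run[0]])
            run = []
        run.append(letter)
    out.extend(run[1:] + [run[0]])
    return out

u = [1, 3, 4, 2, 4, 3, 4, 4, 5, 3, 5, 5, 2]
print(h(3, u))
print(h_inv(3, h(3, u)) == u)
```

The first call reproduces the case 1 example above, and `h_inv` undoes it.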

Now we can return to the discussion of $g$. Recall that we have set $g(w) = g(w'yz) =
h_z(u)z$, where $u = g(w'y)$ is an anagram of $w'y$ ending in $y$ and satisfying $\mathrm{inv}(u) = \mathrm{maj}(w'y)$.
To check condition (d) for this $w$, we must show that $\mathrm{inv}(h_z(u)z) = \mathrm{maj}(w)$. Again consider
two cases. If $y \le z$, then

\[ \mathrm{maj}(w) = \mathrm{maj}(w'yz) = \mathrm{maj}(w'y) = \mathrm{inv}(u). \]

On the other hand, by the lemma and (6.1), we have

\[ \mathrm{inv}(h_z(u)z) = \mathrm{inv}(h_z(u)) + n_{>z}(h_z(u)) = \mathrm{inv}(u) - n_{>z}(u) + n_{>z}(u) = \mathrm{inv}(u) \]

(observe that $n_{>z}(h_z(u)) = n_{>z}(u)$ since $h_z(u)$ and $u$ are anagrams). In the second case,
where $y > z$, we have

\[ \mathrm{maj}(w) = \mathrm{maj}(w'yz) = \mathrm{maj}(w'y) + n - 1 = \mathrm{inv}(u) + n - 1. \]

On the other hand, the lemma and (6.2) give

\[ \mathrm{inv}(h_z(u)z) = \mathrm{inv}(h_z(u)) + n_{>z}(h_z(u)) = \mathrm{inv}(u) + n_{\le z}(u) + n_{>z}(u) = \mathrm{inv}(u) + n - 1, \]

since $u$ has $n-1$ letters, each of which is either $\le z$ or $> z$.

FIGURE 6.4
Computation of $g(w)$.

FIGURE 6.5
Computation of $g^{-1}(w)$.


It remains to prove that $g$ is a bijection, by describing the two-sided inverse map $g^{-1}$.
This is the identity map on words of length at most 1. To compute $g^{-1}(uz)$, first compute
$u' = h_z^{-1}(u)$. Then return the answer $g^{-1}(uz) = (g^{-1}(u'))z$. Here is a nonrecursive description
of the maps $g$ and $g^{-1}$, obtained by “unrolling” the recursive applications of $g$ and $g^{-1}$
in the preceding definitions.

To compute $g(w_1w_2\cdots w_n)$: for $i = 2,\ldots,n$ in this order, apply $h_{w_i}$ to the first $i-1$
letters of the current word.

To compute $g^{-1}(z_1z_2\cdots z_n)$: for $i = n, n-1, \ldots, 2$ in this order, let $z_i'$ be the $i$th
letter of the current word, and apply $h_{z_i'}^{-1}$ to the first $i-1$ letters of the current
word.


6.43. Example. Figure 6.4 illustrates the computation of $g(w)$ for $w = 21331322$. We
find that $g(w) = 23131322$. Observe that $\mathrm{maj}(w) = 1+4+6 = 11 = \mathrm{inv}(g(w))$. Next,
Figure 6.5 illustrates the calculation of $g^{-1}(w)$. We have $g^{-1}(w) = 33213122$ and
$\mathrm{inv}(w) = 10 = \mathrm{maj}(g^{-1}(w))$.
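The nonrecursive descriptions of $g$ and $g^{-1}$ can be run directly. The sketch below is a minimal implementation under the conventions of this section, with words as lists of integers. The names `foata`, `foata_inv`, `h`, and `h_inv` are illustrative, and the run-shifting helpers are repeated so that the block is self-contained.

```python
def inv(w):
    return sum(1 for i in range(len(w))
                 for j in range(i + 1, len(w)) if w[i] > w[j])

def maj(w):
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])

def h(z, u):
    """h_z: shift each run right; runs close on letters of the same type as u[-1]."""
    out, run = [], []
    for letter in u:
        run.append(letter)
        if (letter <= z) == (u[-1] <= z):
            out += [run[-1]] + run[:-1]
            run = []
    return out

def h_inv(z, v):
    """Inverse of h_z: runs open on letters of the same type as v[0]; shift left."""
    out, run = [], []
    for letter in v:
        if run and (letter <= z) == (v[0] <= z):
            out += run[1:] + run[:1]
            run = []
        run.append(letter)
    return out + run[1:] + run[:1]

def foata(w):
    """g(w): for each new letter z, apply h_z to the current prefix, then append z."""
    cur = []
    for z in w:
        cur = h(z, cur) + [z]
    return cur

def foata_inv(w):
    """g^{-1}(w): for i = n, ..., 2, undo h with z = ith letter of the current word."""
    cur = list(w)
    for i in range(len(cur), 1, -1):
        cur[:i - 1] = h_inv(cur[i - 1], cur[:i - 1])
    return cur

w = [2, 1, 3, 3, 1, 3, 2, 2]
print(foata(w), maj(w), inv(foata(w)))
print(foata_inv(w), inv(w), maj(foata_inv(w)))
```

On $w = 21331322$ this reproduces the values stated in Example 6.43.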


We summarize the results of this section in the following theorem. 


6.44. Theorem. For all $n_1,\ldots,n_s \ge 0$,

\[ \sum_{w \in R(1^{n_1}\cdots s^{n_s})} x^{\mathrm{maj}(w)} = \sum_{w \in R(1^{n_1}\cdots s^{n_s})} x^{\mathrm{inv}(w)} = \begin{bmatrix} n_1+\cdots+n_s \\ n_1,\ldots,n_s \end{bmatrix}_x. \]

More precisely, there is a bijection on $R(1^{n_1}\cdots s^{n_s})$ sending maj to inv and preserving the
last letter of each word. □


FIGURE 6.6
First-return analysis for weighted Dyck paths.


6.10 Quantum Catalan Numbers 


In this section, we investigate two weighted analogues of the Catalan numbers. Recall that
the Catalan number $C_n = \frac{1}{n+1}\binom{2n}{n} = \binom{2n}{n,n} - \binom{2n}{n-1,n+1}$ counts the collection of all lattice
paths from $(0,0)$ to $(n,n)$ that never go below the line $y = x$ (§1.10). Let $D_n$ be the
collection of these Dyck paths. Also, let $W_n$ be the set of words that encode the Dyck paths,
where we use 0 to encode a north step and 1 to encode an east step. Elements of $W_n$ will
be called Dyck words.


6.45. Definition: Statistics on Dyck Paths. For every Dyck path $P \in D_n$, let $\mathrm{area}(P)$
be the number of complete unit squares located between $P$ and the line $y = x$. If $P$ is
encoded by the Dyck word $w \in W_n$, let $\mathrm{inv}(P) = \mathrm{inv}(w)$ and $\mathrm{maj}(P) = \mathrm{maj}(w)$.


For example, the path $P$ shown in Figure 6.6 has $\mathrm{area}(P) = 23$. One sees that $\mathrm{inv}(P)$
is the number of unit squares in the region bounded by $P$, the $y$-axis, and the line $y = n$.
We also have $\mathrm{inv}(P) + \mathrm{area}(P) = \binom{n}{2}$, since $\binom{n}{2}$ is the total number of area squares in the
bounding triangle. The statistic $\mathrm{maj}(P)$ is the sum of the number of steps in the path that
precede each “left-turn” where an east step (1) is immediately followed by a north step (0).
For the path in Figure 6.6, we have $\mathrm{inv}(P) = 97$ and $\mathrm{maj}(P) = 4+6+10+16+18+22+24+28 = 128$.
6.46. Example. When $n = 3$, examination of Figure 1.8 shows that

\[ G_{D_3,\mathrm{area}}(x) = 1 + 2x + x^2 + x^3; \]
\[ G_{D_3,\mathrm{inv}}(x) = 1 + x + 2x^2 + x^3; \]
\[ G_{D_3,\mathrm{maj}}(x) = 1 + x^2 + x^3 + x^4 + x^6. \]

When $n = 4$, a longer calculation gives

\[ G_{D_4,\mathrm{area}}(x) = 1 + 3x + 3x^2 + 3x^3 + 2x^4 + x^5 + x^6; \]
\[ G_{D_4,\mathrm{maj}}(x) = 1 + x^2 + x^3 + 2x^4 + x^5 + 2x^6 + x^7 + 2x^8 + x^9 + x^{10} + x^{12}. \]
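The three statistics of Definition 6.45 can be tabulated by listing all Dyck words of a given order. The sketch below uses the encoding from the text (0 for a north step, 1 for an east step); the function names and the use of `Counter` to hold a distribution are choices made for this illustration.

```python
from collections import Counter
from itertools import combinations

def dyck_words(n):
    """Yield all Dyck words of order n as tuples (0 = north, 1 = east)."""
    for east in combinations(range(2 * n), n):
        east_set = set(east)
        w = tuple(1 if i in east_set else 0 for i in range(2 * n))
        x = y = 0
        for letter in w:
            if letter:
                x += 1
            else:
                y += 1
            if x > y:                 # the path dipped below y = x
                break
        else:
            yield w

def area(w):
    """Complete unit squares between the path and the line y = x."""
    x = y = total = 0
    for letter in w:
        if letter == 0:
            total += y - x            # full cells in this row, from the path to the diagonal
            y += 1
        else:
            x += 1
    return total

def inv(w):
    return sum(1 for i in range(len(w))
                 for j in range(i + 1, len(w)) if w[i] > w[j])

def maj(w):
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])

def distribution(n, stat):
    """Counter mapping each value of stat to the number of Dyck paths attaining it."""
    return Counter(stat(w) for w in dyck_words(n))

for stat in (area, inv, maj):
    print(stat.__name__, sorted(distribution(3, stat).items()))
```

For $n = 3$ and $n = 4$ the output can be compared term by term with the polynomials in Example 6.46.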


There is no particularly nice closed formula for $G_{D_n,\mathrm{area}}(x)$ (although determinant
formulas do exist for this polynomial). However, these generating functions do satisfy a
recursion, which is the analogue of the “first-return” recursion used in the unweighted case
(§2.7).


6.47. Theorem: Recursion for Dyck Paths Weighted by Area. For all $n \ge 0$, set
$C_n(x) = G_{D_n,\mathrm{area}}(x)$. Then $C_0(x) = 1$ and, for all $n \ge 1$,

\[ C_n(x) = \sum_{k=1}^{n} x^{k-1}\, C_{k-1}(x)\, C_{n-k}(x). \]


Proof. We imitate the proof of 2.33, but now we must take weights into account. For
$1 \le k \le n$, write $D_{n,k}$ for the set of Dyck paths of order $n$ whose first return to the line $y = x$
occurs at $(k,k)$. Evidently, $D_n$ is the disjoint union of the $D_{n,k}$'s, so the sum rule gives

\[ C_n(x) = \sum_{k=1}^{n} G_{D_{n,k},\mathrm{area}}(x). \]

For fixed $k$, we have a bijection from $D_{n,k}$ to $D_{k-1} \times D_{n-k}$ defined by sending $P = 0, P_1, 1, P_2$
to $(P_1, P_2)$, where the displayed 1 encodes the east step that arrives at $(k,k)$. See Figure 6.6.
Examination of the figure shows that

\[ \mathrm{area}(P) = \mathrm{area}(P_1) + \mathrm{area}(P_2) + (k-1), \]

where the $k-1$ counts the shaded cells in the figure which are not included in the calculation
of $\mathrm{area}(P_1)$. By the product rule and weight-shifting rule, we see that

\[ G_{D_{n,k},\mathrm{area}}(x) = C_{k-1}(x)\, C_{n-k}(x)\, x^{k-1}. \]

Inserting this into the previous formula gives the recursion. □
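The first-return recursion gives a quick way to generate the polynomials $C_n(x)$. A minimal sketch, with polynomials stored as coefficient lists (constant term first) and the hypothetical function name `area_catalan`:

```python
def area_catalan(n_max):
    """Return [C_0, C_1, ..., C_{n_max}], each as a coefficient list."""
    C = [[1]]                                        # C_0(x) = 1
    for n in range(1, n_max + 1):
        poly = [0] * (n * (n - 1) // 2 + 1)          # C_n has degree n(n-1)/2
        for k in range(1, n + 1):                    # first return at (k, k)
            for i, s in enumerate(C[k - 1]):
                for j, t in enumerate(C[n - k]):
                    poly[i + j + k - 1] += s * t     # x^{k-1} C_{k-1}(x) C_{n-k}(x)
        C.append(poly)
    return C

for n, poly in enumerate(area_catalan(4)):
    print(n, poly)
```

The rows for $n = 3$ and $n = 4$ match the area generating functions of Example 6.46, and summing any row gives the ordinary Catalan number.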


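Now let us consider the generating function $G_{D_n,\mathrm{maj}}(x)$. This polynomial does have a
nice closed formula, as we see in the next theorem.


6.48. Theorem: Dyck Paths Weighted by Major Index. For all $n \ge 0$,

\[ G_{D_n,\mathrm{maj}}(x) = \begin{bmatrix} 2n \\ n,n \end{bmatrix}_x - x \begin{bmatrix} 2n \\ n-1,n+1 \end{bmatrix}_x = \frac{1}{[n+1]_x} \begin{bmatrix} 2n \\ n,n \end{bmatrix}_x. \]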

Proof. The second equality follows from the manipulation

\[ \begin{bmatrix} 2n \\ n,n \end{bmatrix}_x - x \begin{bmatrix} 2n \\ n-1,n+1 \end{bmatrix}_x
 = \begin{bmatrix} 2n \\ n,n \end{bmatrix}_x \left( 1 - \frac{x[n]_x}{[n+1]_x} \right)
 = \begin{bmatrix} 2n \\ n,n \end{bmatrix}_x \cdot \frac{1}{[n+1]_x}, \]

which uses $[n+1]_x - x[n]_x = 1$. The other equality in the theorem statement can be rewritten

\[ x \begin{bmatrix} 2n \\ n-1,n+1 \end{bmatrix}_x + G_{D_n,\mathrm{maj}}(x) = \begin{bmatrix} 2n \\ n,n \end{bmatrix}_x. \]

We will give a bijective proof of this result reminiscent of André’s reflection principle
(see 1.56). Consider the set of words $S = R(0^n1^n)$, weighted by major index. By 6.44, the
generating function for this set is $\begin{bmatrix} 2n \\ n,n \end{bmatrix}_x$. On the other hand, we can write $S$ as the disjoint
union of $W_n$ and $T$, where $W_n$ is the set of Dyck words and $T$ consists of all other words in
$R(0^n1^n)$. We will define a bijection $g : T \to R(0^{n+1}1^{n-1})$ such that $\mathrm{maj}(w) = 1 + \mathrm{maj}(g(w))$
for all $w \in T$. This will give

\[ \begin{bmatrix} 2n \\ n,n \end{bmatrix}_x = G_{S,\mathrm{maj}}(x) = G_{T,\mathrm{maj}}(x) + G_{W_n,\mathrm{maj}}(x)
 = x \begin{bmatrix} 2n \\ n-1,n+1 \end{bmatrix}_x + G_{D_n,\mathrm{maj}}(x), \]

as desired. To define $g(w)$ for $w \in T$, regard $w$ as a lattice path in an $n \times n$ rectangle by
interpreting 0’s as north steps and 1’s as east steps. Find the largest $k > 0$ such that the
path $w$ touches the line $y = x - k$. Such a $k$ must exist, because $w \in T$ is not a Dyck
path. Consider the first vertex $v$ on $w$ that touches the line in question. (See Figure 6.7.)
The path $w$ must arrive at $v$ by taking an east step, and $w$ must leave $v$ by taking a north
step. These steps correspond to certain adjacent letters $w_i = 1$ and $w_{i+1} = 0$ in the word $w$.
Furthermore, since $v$ is the first visitation to this line, we must have either $i = 1$ or $w_{i-1} = 1$
(i.e., the step before $w_i$ must be an east step if it exists). Let $g(w)$ be the word obtained by
changing $w_i$ from 1 to 0. Pictorially, we “tip” the east step arriving at $v$ upwards, changing
it to a north step (which causes the following steps to shift to the northwest). The word
$w = \cdots 1,1,0 \cdots$ turns into $g(w) = \cdots 1,0,0 \cdots$, so the major index drops by exactly 1
when we pass from $w$ to $g(w)$. This result also holds if $i = 1$. The new word $g(w)$ has $n-1$
east steps and $n+1$ north steps, so $g(w) \in R(0^{n+1}1^{n-1})$. Finally, $g$ is invertible. Given a
path/word $P \in R(0^{n+1}1^{n-1})$, again take the largest $k \ge 0$ such that $P$ touches the line
$y = x - k$, and let $v$ be the last time $P$ touches this line. Here, $v$ is preceded by an east
step and followed by two north steps (or $v$ is the origin and is followed by a north step).
Changing the first north step following $v$ into an east step produces a path $g'(P) \in R(0^n1^n)$.
One routinely checks that $g'(P)$ cannot be a Dyck path (so $g'$ maps into $T$), and that $g'$
is the two-sided inverse of $g$. The key is that the selection rules for $v$ ensure that the same
step is “tipped” when we apply $g$ followed by $g'$, and similarly in the other order. □
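The tipping map $g$ can be tested exhaustively for small $n$. In the sketch below, words are tuples with 0 for north and 1 for east, and the running quantity `d` (east steps minus north steps) measures how far the path lies below $y = x$. The names `tip`, `words`, and `is_dyck` are assumptions of this illustration.

```python
from itertools import combinations

def maj(w):
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])

def words(zeros, ones):
    """All rearrangements of `zeros` 0's and `ones` 1's, as tuples."""
    n = zeros + ones
    for pos in combinations(range(n), ones):
        chosen = set(pos)
        yield tuple(1 if i in chosen else 0 for i in range(n))

def is_dyck(w):
    d = 0
    for letter in w:
        d += 1 if letter == 1 else -1
        if d > 0:                       # more east than north steps: below y = x
            return False
    return True

def tip(w):
    """g(w): change to 0 the east step arriving at the first vertex on the
    lowest line y = x - k touched by w (w must not be a Dyck word)."""
    d = best = 0
    idx = None
    for i, letter in enumerate(w):
        d += 1 if letter == 1 else -1
        if d > best:                    # strict: keeps the FIRST vertex on the extreme line
            best, idx = d, i
    assert idx is not None, "tip is only defined on non-Dyck words"
    return w[:idx] + (0,) + w[idx + 1:]

n = 3
T = [w for w in words(n, n) if not is_dyck(w)]
images = {tip(w) for w in T}
print(len(T), len(images), images == set(words(n + 1, n - 1)))
```

For $n = 3$ there are 15 non-Dyck words, and their images are exactly the 15 words of $R(0^41^2)$, each with major index one less than its preimage.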


FIGURE 6.7
The tipping bijection.


Summary


• Generating Functions for Weighted Sets. A weighted set is a pair $(S, \mathrm{wt})$ where $S$ is a
set and $\mathrm{wt} : S \to \mathbb{N}$ is a function (called a statistic on $S$). The generating function for
this weighted set is $G_{S,\mathrm{wt}}(x) = G_S(x) = \sum_{z \in S} x^{\mathrm{wt}(z)}$. Writing $G_S(x) = \sum_{k \ge 0} a_k x^k$, $a_k$
is the number of objects in $S$ having weight $k$.

• Weight-Preserving Bijections. A weight-preserving bijection from $(S, \mathrm{wt})$ to $(T, \mathrm{wt}')$ is
a bijection $f : S \to T$ with $\mathrm{wt}'(f(z)) = \mathrm{wt}(z)$ for all $z \in S$. When such an $f$ exists,
$G_{S,\mathrm{wt}}(x) = G_{T,\mathrm{wt}'}(x)$. More generally, if there is $b \in \mathbb{Z}$ with $\mathrm{wt}'(f(z)) = b + \mathrm{wt}(z)$ for
all $z \in S$, then $G_{T,\mathrm{wt}'}(x) = x^b G_{S,\mathrm{wt}}(x)$.

• Sum Rule for Weighted Sets. Suppose $(S_i, w_i)$ are weighted sets for $1 \le i \le k$, $S$ is the
disjoint union of the $S_i$, and we define $w : S \to \mathbb{N}$ by $w(z) = w_i(z)$ for $z \in S_i$. Then
$G_S(x) = \sum_{i=1}^{k} G_{S_i}(x)$.

• Product Rule for Weighted Sets. Suppose $(S_i, w_i)$ are weighted sets for $1 \le i \le k$,
$S = S_1 \times S_2 \times \cdots \times S_k$ is the product of the $S_i$, and we define $w : S \to \mathbb{N}$ by
$w(z_1,\ldots,z_k) = \sum_{i=1}^{k} w_i(z_i)$ for $z_i \in S_i$. Then $G_S(x) = \prod_{i=1}^{k} G_{S_i}(x)$.

• Quantum Integers, Factorials, Binomial Coefficients, and Multinomial Coefficients. Suppose
$x$ is a variable, $n, k, n_i \in \mathbb{N}$, $0 \le k \le n$, and $\sum_i n_i = n$. We define

\[ [n]_x = \sum_{i=0}^{n-1} x^i = \frac{x^n - 1}{x - 1}, \quad [n]!_x = \prod_{i=1}^{n} [i]_x, \quad
 \begin{bmatrix} n \\ k \end{bmatrix}_x = \frac{[n]!_x}{[k]!_x\,[n-k]!_x}, \quad
 \begin{bmatrix} n \\ n_1,\ldots,n_s \end{bmatrix}_x = \frac{[n]!_x}{\prod_{i=1}^{s} [n_i]!_x}. \]

These are all polynomials in $x$ with coefficients in $\mathbb{N}$.

• Recursions for Quantum Binomial Coefficients, etc. The following recursions hold:

\[ [n]!_x = [n-1]!_x \cdot [n]_x; \]
\[ \begin{bmatrix} a+b \\ a,b \end{bmatrix}_x
 = x^b \begin{bmatrix} a+b-1 \\ a-1,b \end{bmatrix}_x + \begin{bmatrix} a+b-1 \\ a,b-1 \end{bmatrix}_x
 = \begin{bmatrix} a+b-1 \\ a-1,b \end{bmatrix}_x + x^a \begin{bmatrix} a+b-1 \\ a,b-1 \end{bmatrix}_x; \]
\[ \begin{bmatrix} n_1+\cdots+n_s \\ n_1,\ldots,n_s \end{bmatrix}_x
 = \sum_{k=1}^{s} x^{n_1+\cdots+n_{k-1}} \begin{bmatrix} n_1+\cdots+n_s-1 \\ n_1,\ldots,n_k-1,\ldots,n_s \end{bmatrix}_x. \]

• Statistics on Words. Given a word $w = w_1w_2\cdots w_n$ over a totally ordered alphabet,
$\mathrm{Inv}(w) = \{(i,j) : i < j \text{ and } w_i > w_j\}$, $\mathrm{inv}(w) = |\mathrm{Inv}(w)|$, $\mathrm{Des}(w) = \{i < n : w_i > w_{i+1}\}$,
$\mathrm{des}(w) = |\mathrm{Des}(w)|$, and $\mathrm{maj}(w) = \sum_{i \in \mathrm{Des}(w)} i$. We have

\[ \begin{bmatrix} n_1+\cdots+n_s \\ n_1,\ldots,n_s \end{bmatrix}_x
 = \sum_{w \in R(a_1^{n_1}\cdots a_s^{n_s})} x^{\mathrm{inv}(w)} = \sum_{w \in R(a_1^{n_1}\cdots a_s^{n_s})} x^{\mathrm{maj}(w)}. \]

The second equality follows from a subtle bijection due to Foata, which maps maj to inv
while preserving the last letter of the word. In particular, letting $S_n = R(1^12^1\cdots n^1)$,

\[ [n]!_x = \sum_{w \in S_n} x^{\mathrm{inv}(w)} = \sum_{w \in S_n} x^{\mathrm{maj}(w)}. \]

These two formulas can be proved bijectively by mapping $w \in S_n$ to its “inversion
table” $(t_1,\ldots,t_n)$, where $t_i$ records the change in inversions (resp. major index) caused
by inserting the symbol $i$ into the subword of $w$ consisting of $1,2,\ldots,i-1$.

• Weighted Lattice Paths. The quantum binomial coefficient
$\begin{bmatrix} a+b \\ a,b \end{bmatrix}_x = \begin{bmatrix} b+a \\ b,a \end{bmatrix}_x$ counts lattice
paths in an $a \times b$ (or $b \times a$) rectangle, weighted either by area above the path or area
below the path. This coefficient also counts integer partitions with first part $\le a$ and
length $\le b$, weighted by area.

• Weighted Dyck Paths. Let $C_n(x)$ be the generating function for Dyck paths of order
$n$, weighted by area between the path and $y = x$. Then $C_0(x) = 1$ and
$C_n(x) = \sum_{k=1}^{n} x^{k-1} C_{k-1}(x) C_{n-k}(x)$. The generating function for Dyck paths (viewed as words
in $R(0^n1^n)$) weighted by major index is
$\frac{1}{[n+1]_x}\begin{bmatrix} 2n \\ n,n \end{bmatrix}_x = \begin{bmatrix} 2n \\ n,n \end{bmatrix}_x - x\begin{bmatrix} 2n \\ n-1,n+1 \end{bmatrix}_x$.


Exercises 


In the exercises below, $S_n$ denotes the set $R(12\cdots n)$ of permutations of $\{1,2,\ldots,n\}$, unless
otherwise specified.


6.49. Let S = R(a'b'c?), T = R(a'b?c!), and U = R(ab'c!). Confirm that Gg iny(z) = 
Grinv(@) = Gujinv(£) (as asserted in 6.15) by listing all weighted objects in T and U. 


6.50. (a) Compute inv(w), des(w), and maj(w) for each w € S4. (b) Use (a) to find the gen- 
erating functions Gg, inv(%), Gsydes(%), and Gg, maj(x). (c) Compute [4]!, by polynomial 
multiplication, and compare to the answers in (b). 


6.51. (a) Compute inv(w) for the following words w: 4251673, 101101110001, 314423313, 
55233514425331. (b) Compute Des(w), des(w), and maj(w) for each word w in (a). 


6.52. Confirm the formulas for $G_{D_4,\mathrm{area}}(x)$ and $G_{D_4,\mathrm{maj}}(x)$ stated in 6.46 by listing weighted
Dyck paths of order 4.


6.53. (a) Find the maximum value of inv(w), des(w), and maj(w) as w ranges over $S_n$. (b)
Repeat (a) for w ranging over $R(1^{n_1} 2^{n_2} \cdots s^{n_s})$.


6.54. Let S be the set of k-letter words over the alphabet n. For $w \in S$, let wt(w) be the
sum of all letters in w. Compute $G_{S,\mathrm{wt}}(x)$.


6.55. Let S be the set of 5-letter words using the 26-letter English alphabet. For $w \in S$,
let wt(w) be the number of vowels in w. Compute $G_{S,\mathrm{wt}}(x)$.


6.56. Let S be the set of all subsets of $\{1, 2, \ldots, n\}$. For $A \in S$, let wt(A) = |A|. Use the
product rule for weighted sets to compute $G_{S,\mathrm{wt}}(x)$ (cf. 1.20).


6.57. Let S be the set of all k-element multisets using the alphabet n. For $M \in S$, let
wt(M) be the sum of the elements in M, counting multiplicities. Express $G_{S,\mathrm{wt}}(x)$ in terms
of quantum binomial coefficients.


6.58. (a) How many permutations of {1,2,...,8} have exactly 17 inversions? (b) How many 
permutations of {1,2,...,9} have major index 29? 


6.59. (a) How many lattice paths from (0,0) to (8,6) have area 21? (b) How many words 
in R(0°1°2") have ten inversions? (c) How many Dyck paths of order 7 have major index 
30?


6.60. Quantum Binomial Theorem. Let S be the set of all subsets of $\{1, 2, \ldots, n\}$. For
$A \in S$, let wt(A) be the sum of the elements in A. Show that


$$\prod_{i=1}^{n} (1 + x^i) = G_{S,\mathrm{wt}}(x) = \sum_{k=0}^{n} x^{k(k+1)/2} \begin{bmatrix} n \\ k, n-k \end{bmatrix}_x.$$


6.61. Use an involution to prove $\sum_{k=0}^{n} (-1)^k x^{k(k-1)/2} \begin{bmatrix} n \\ k, n-k \end{bmatrix}_x = 0$ for all $n > 0$.


6.62. Compute each of the following polynomials by any method, expressing the answer in 
the form S59 ax: (a) [7]; (b) [6}!e5 (¢) [§).3 @ b3al,) © gelssl, © mele, 


6.63. (a) Factor the polynomials $[4]_x$, $[5]_x$, $[6]_x$, and $[12]_x$ in $\mathbb{Z}[x]$. (b) How do these
polynomials factor in $\mathbb{C}[x]$?


6.64. Compute $\begin{bmatrix} 4 \\ 2, 2 \end{bmatrix}_x$ in six ways, by: (a) simplifying the defining formula in 6.31; (b) using
the first recursion in 6.33; (c) using the second recursion in 6.33; (d) enumerating words in
R(0011) by inversions; (e) enumerating words in R(0011) by major index; (f) enumerating
partitions contained in a 2 × 2 box by area.


: lc “+k 
1 4,41 +s + Ne x 
(b) Give a combinatorial proof of the identity in (a).


algebraically. 


6.65. (a) Prove the identity jew akan Pa 


y+ rs va 


6.66. For $1 \le i \le 3$, let $(T_i, w_i)$ be a set of weighted objects. (a) Prove that $\mathrm{id}_{T_1} : T_1 \to T_1$ is
a weight-preserving bijection. (b) Prove that if $f : T_1 \to T_2$ is a weight-preserving bijection,
then $f^{-1} : T_2 \to T_1$ is weight-preserving. (c) Prove that if $f : T_1 \to T_2$ and $g : T_2 \to T_3$ are
weight-preserving bijections, so is $g \circ f$.


6.67. Prove the second recursion in 6.33: (a) by an algebraic manipulation; (b) by removing 
the first step from a lattice path in an a x b rectangle. 


6.68. Let f be the map in the second proof of 6.36, with a = b = 4. Compute: (a)
f(2413, 1423, 10011010); (b) f(4321, 4321, 11110000); (c) f(2134, 3214, 01010101). Verify
that weights are preserved in each case.


6.69. Let f be the map in the second proof of 6.36, with a = 5 and b = 4. For each w
given here, compute $f^{-1}(w)$ and verify that weights are preserved: (a) w = 123456789; (b)
w = 371945826; (c) w = 987456321.


6.70. Repeat 6.69 assuming a = 2 and b = 7. 


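6.71. Prove that $[n_1 + \cdots + n_s]!_x = [n_1]!_x \cdots [n_s]!_x \sum_{w \in R(1^{n_1} \cdots s^{n_s})} x^{\mathrm{inv}(w)}$ by defining a
weight-preserving bijection $f : S_{n_1 + \cdots + n_s} \to S_{n_1} \times \cdots \times S_{n_s} \times R(1^{n_1} \cdots s^{n_s})$.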


6.72. (a) Find and prove an analogue of the identity 77, (") = (°”) involving quantum 


binomial coefficients (cf. 2.19 and Figure 2.1). (b) Similarly, derive a quantum analogue of 


the identity 775 ae = (ey: 


6.73. Let S be the set of two-element subsets of Deck. For $H \in S$, let wt(H) be the sum
of the values of the two cards in H, where aces count as 11 and jacks, queens, and kings
count as 10. Find $G_{S,\mathrm{wt}}(x)$.


6.74. Define the weight of a five-card poker hand to be the number of face cards in the hand 
(the face cards are aces, jacks, queens, and kings). Compute the generating functions for the 
following sets of poker hands relative to this weight: (a) full house hands; (b) three-of-a-kind 
hands; (c) flush hands; (d) straight hands. 


6.75. Define the weight of a five-card poker hand to be the number of diamond cards in 
the hand. Compute the generating functions for the following sets of poker hands relative 
to this weight: (a) full house hands; (b) three-of-a-kind hands; (c) flush hands; (d) straight 
hands. 


6.76. Let $T_n$ be the set of connected simple graphs with vertex set $\{1, 2, \ldots, n\}$. Let the
weight of a graph in $T_n$ be the number of edges. Compute $G_{T_n}(x)$ for $1 \le n \le 5$.


6.77. Let $f_n$ and $g_n$ be the maps in 6.26. Compute $f_6(341265)$ and $g_6(0, 0, 1, 3, 2, 3)$, and
verify that weights are preserved for these two objects.


6.78. Let $f_n$ and $g_n$ be the maps in 6.26. Compute $f_8(35261784)$ and $g_8(0, 1, 0, 3, 2, 4, 6, 5)$,
and verify that weights are preserved for these two objects.


6.79. In 6.26, we constructed an inversion table for $w \in S_n$ by classifying inversions $(i, j) \in \mathrm{Inv}(w)$
based on the left-hand value $w_i$. Define a new map $f : S_n \to 1 \times 2 \times \cdots \times n$ by
classifying inversions $(i, j) \in \mathrm{Inv}(w)$ based on the right-hand value $w_j$. Show that f is a
bijection, and compute f(35261784) and $f^{-1}(0, 1, 0, 3, 2, 4, 6, 5)$.


6.80. Define a map $f : S_n \to 1 \times 2 \times \cdots \times n$ by setting $f(w) = (t_n, \ldots, t_1)$, where
$t_i = |\{j : (i, j) \in \mathrm{Inv}(w)\}|$. Show that f is a bijection. (Informally, f classifies inversions of w based
on the left position of the inversion pair.) Compute f(35261784) and $f^{-1}(0, 1, 0, 3, 2, 4, 6, 5)$.


6.81. Define a map $f : S_n \to 1 \times 2 \times \cdots \times n$ that classifies inversions of w based on the right
position of the inversion pair (cf. 6.80). Show that f is a bijection, and compute f(35261784)
and $f^{-1}(0, 1, 0, 3, 2, 4, 6, 5)$.


6.82. Let $f_n$ be the map in 6.29. Compute $f_6(341265)$ and $f_6^{-1}(0, 0, 1, 3, 2, 3)$, and verify
that weights are preserved for these two objects.


6.83. Let $f_n$ be the map in 6.29. Compute $f_8(35261784)$ and $f_8^{-1}(0, 1, 0, 3, 2, 4, 6, 5)$, and
verify that weights are preserved for these two objects.


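6.84. Coinversions. Define the coinversions of a word $w = w_1 w_2 \cdots w_n$ by
$\mathrm{coinv}(w) = \sum_{i<j} \chi(w_i < w_j)$. Prove that
$\sum_{w \in R(a_1^{n_1} \cdots a_s^{n_s})} x^{\mathrm{coinv}(w)} = \begin{bmatrix} n_1 + \cdots + n_s \\ n_1, \ldots, n_s \end{bmatrix}_x$ (a) by using a
bijection to reduce to the corresponding result for inv; (b) by verifying a suitable recursion.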


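6.85. Given a word $w = w_1 \cdots w_n$, define $\mathrm{comaj}(w) = \sum_{i<n} i \chi(w_i < w_{i+1})$ and
$\mathrm{rlmaj}(w) = \sum_{i<n} (n - i) \chi(w_i > w_{i+1})$. Calculate $\sum_{w \in S_n} x^{\mathrm{comaj}(w)}$ and $\sum_{w \in S_n} x^{\mathrm{rlmaj}(w)}$.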


6.86. For $w \in S_n$, let wt(w) be the sum of all $i < n$ such that $i + 1$ appears to the left of $i$
in w. Compute $G_{S_n,\mathrm{wt}}(x)$.


6.87. (a) Suppose $w = w_1 w_2 \cdots w_{n-1}$ is a fixed permutation of $n - 1$ distinct letters. Let
a be a new letter less than all letters appearing in w. Let S be the set of n words that can
be obtained from w by inserting a in some gap. Prove that $\sum_{z \in S} x^{\mathrm{maj}(z)} = x^{\mathrm{maj}(w)} [n]_x$. (b)
Use (a) to obtain another proof that $\sum_{w \in S_n} x^{\mathrm{maj}(w)} = [n]!_x$.


6.88. Suppose k is fixed in $\{1, 2, \ldots, n\}$, and $w = w_1 w_2 \cdots w_{n-1}$ is a fixed permutation of
$\{1, 2, \ldots, k-1, k+1, \ldots, n\}$. Let S be the set of n words that can be obtained from w by
inserting k in some gap. Prove or disprove: $\sum_{z \in S} x^{\mathrm{maj}(z)} = x^{\mathrm{maj}(w)} [n]_x$.


6.89. Define a cyclic shift function $c : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ by $c(i) = i + 1$ for $i < n$,
and $c(n) = 1$. Define a map $C : S_n \to S_n$ by setting $C(w_1 w_2 \cdots w_n) = c(w_1) c(w_2) \cdots c(w_n)$.
(a) Prove: for all $w \in S_n$, $\mathrm{maj}(C(w)) = \mathrm{maj}(w) - 1$ if $w_n \ne n$, and
$\mathrm{maj}(C(w)) = \mathrm{maj}(w) + n - 1$ if $w_n = n$. (b) Use (a) to show combinatorially that, for $1 \le k \le n$,


$$\sum_{w \in S_n :\, w_n = k} x^{\mathrm{maj}(w)} = x^{n-k} \sum_{w \in S_n :\, w_n = n} x^{\mathrm{maj}(w)}.$$


(c) Use (b), the sum rule, and induction to obtain another proof of 6.29.


6.90. For all $n \ge 1$, all $T \subseteq \{1, 2, \ldots, n-1\}$, and $1 \le k \le n$, let G(n, T, k) be the number
of permutations w of $\{1, 2, \ldots, n\}$ with Des(w) = T and $w_n = k$. (a) Find a recursion for
the quantities G(n, T, k). (b) Count the number of permutations of 10 objects with descent
set {2, 3, 5, 7}.


6.91. Let w be the word 4523351452511332, and let $h_z$ be the map from §6.9. Compute
$h_z(w)$ for z = 1, 2, 3, 4, 5, 6. Verify that (6.1) or (6.2) holds in each case.


6.92. Let w be the word 4523351452511332, and let $h_z$ be the map from §6.9. Compute
$h_z^{-1}(w)$ for z = 1, 2, 3, 4, 5, 6.


6.93. Compute the image of each $w \in S_4$ under the map g from §6.9.


6.94. Let g be the map in §6.9. Compute g(w) for each of these words: (a) 4251673; (b)
27418563; (c) 101101110001; (d) 314423313. Verify that inv(g(w)) = maj(w) in each case.


6.95. Let g be the map in §6.9. Compute $g^{-1}(w)$ for each word w in 6.94.


6.96. Let g be the bijection in the proof of 6.48. Compute g(w) for each non-Dyck word
$w \in R(0^3 1^3)$.


6.97. Quantum Fibonacci Numbers. (a) Let W,, be the set of words in {0,1}” with no 
two consecutive zeroes, and let the weight of a word be the number of zeroes in it. Find a 
recursion for the generating functions Gy, (x), and use this to compute G'y, (). (b) Repeat 
part (a), taking the weight to be the number of ones in the word. 


6.98. Let $S_{n,k}$ be the set of non-attacking placements of $n - k$ rooks on the board $\Delta_n$
(see 2.63). Define the weight of such a placement as follows. Each rook in the placement
“cancels” all squares above it in its column. The weight of the placement is the total number
of uncanceled squares located due west of rooks in the placement. Find a recursion for
the generating functions $G_{n,k} = G_{S_{n,k},\mathrm{wt}}(x)$, which are quantum analogues of the Stirling
numbers of the second kind. Compute these generating functions for $0 \le k \le n \le 5$.


6.99. Let $C_{n,k}$ be the set of permutations of n consisting of k disjoint cycles. Define a
statistic on permutations $w \in C_{n,k}$ so that the associated generating functions satisfy the
recursion


$$G_{C_{n,k}}(x) = G_{C_{n-1,k-1}}(x) + [n-1]_x G_{C_{n-1,k}}(x).$$


6.100. Let $T_n$ be the set of trees with vertex set n. Can you find a statistic on trees such
that the associated generating function satisfies $G_{T_n}(x) = [n]_x^{n-2}$?


6.101. Multivariable Generating Functions. Suppose S is a finite set, and $w_1, \ldots, w_n : S \to \mathbb{N}$
are n statistics on S. The generating function for S relative to the n weights
$w_1, \ldots, w_n$ is $G_S(x_1, \ldots, x_n) = \sum_{z \in S} \prod_{i=1}^{n} x_i^{w_i(z)}$. Formulate and
prove versions of the sum rule, product rule, bijection rule, and weight-shifting rule for such
generating functions.


6.102. Extend 6.60 to a formula for $\prod_{i=1}^{n} (1 + t x^i)$ by weighting subsets of $\{1, 2, \ldots, n\}$ by
the number of elements in the subset and by the sum of the elements in the subset.


6.103. Recall from 1.29 that we can view permutations $w \in S_n$ as bijective maps of
$\{1, 2, \ldots, n\}$ into itself. Define $I : S_n \to S_n$ by $I(w) = w^{-1}$ for $w \in S_n$. (a) Show
that $I \circ I = \mathrm{id}_{S_n}$. (b) Show that inv(I(w)) = inv(w) for all $w \in S_n$. (c) Define
imaj(w) = maj(I(w)) for all $w \in S_n$. Compute the two-variable generating function
$G_n(x, y) = \sum_{w \in S_n} x^{\mathrm{maj}(w)} y^{\mathrm{imaj}(w)}$ for $1 \le n \le 4$. Prove that $G_n(x, y) = G_n(y, x)$.


6.104. Let g be the map in §6.9, and let $\mathrm{IDes}(w) = \mathrm{Des}(w^{-1})$ for $w \in S_n$. (a) Show that
for all $w \in S_n$, IDes(g(w)) = IDes(w). (b) Construct a bijection $h : S_n \to S_n$ such that, for
all $w \in S_n$, inv(h(w)) = maj(w) and maj(h(w)) = inv(w).


6.105. Let $P_n$ be the set of integer partitions whose diagrams fit in the diagram of
$(n-1, n-2, \ldots, 2, 1, 0)$, i.e., $\mu \in P_n$ iff $\ell(\mu) \le n$ and $\mu_i \le n - i$ for $1 \le i \le n$. Let $G_n(x) = \sum_{\mu \in P_n} x^{|\mu|}$.
Find a recursion satisfied by $G_n(x)$ and use this to calculate $G_5(x)$. What is the relation
between $G_n(x)$ and the quantum Catalan number $C_n(x)$ from §6.10?


6.106. Bounce Statistic on Dyck Paths. Given a Dyck path $P \in D_n$, define a
new weight bounce(P) as follows. A ball starts at (0, 0) and moves north and east to
(n, n) according to the following rules. The ball moves north $v_0$ steps until blocked by
an east step of P, then moves east $v_0$ steps to the line $y = x$. The ball then moves
north $v_1$ steps until blocked by the east step of P starting on the line $x = v_0$, then
moves east $v_1$ steps to the line $y = x$. This bouncing process continues, generating a
sequence $(v_0, v_1, \ldots, v_s)$ of vertical moves adding to n. We define $\mathrm{bounce}(P) = \sum_{i=0}^{s} i v_i$ and
$C_n(q, t) = \sum_{P \in D_n} q^{\mathrm{area}(P)} t^{\mathrm{bounce}(P)}$. (a) Calculate $C_n(q, t)$ for $n \le 4$ by enumerating Dyck
paths. (b) Let $C_{n,k}(q, t) = \sum_{P \in D_n :\, v_0(P) = k} q^{\mathrm{area}(P)} t^{\mathrm{bounce}(P)}$ be the generating function for
Dyck paths that start with exactly k north steps. Establish the recursion


$$C_{n,k}(q, t) = q^{k(k-1)/2} t^{n-k} \sum_{r=1}^{n-k} \begin{bmatrix} r+k-1 \\ r, k-1 \end{bmatrix}_q C_{n-k,r}(q, t)$$


by “removing the first bounce.” Show also that $C_n(q, t) = t^{-n} C_{n+1,1}(q, t)$. (c) Use the
recursion in (b) to calculate $C_n(q, t)$ for n = 5, 6. (d) Prove that
$q^{n(n-1)/2} C_n(q, 1/q) = \sum_{P \in D_n} q^{\mathrm{maj}(P)}$. (e) Can you prove $C_n(q, t) = C_n(t, q)$ for all $n \ge 1$?


6.107. Let $G_n$ be the set of sequences $g = (g_0, g_1, \ldots, g_{n-1})$ of nonnegative integers with
$g_0 = 0$ and $g_{i+1} \le g_i + 1$ for all $i < n - 1$ (cf. 2.120). For $g \in G_n$, define
$\mathrm{area}(g) = \sum_{i=0}^{n-1} g_i$ and $\mathrm{dinv}(g) = \sum_{i<j} \chi(g_i - g_j \in \{0, 1\})$. (a) Find a bijection $k : G_n \to D_n$ such
that area(k(g)) = area(g) for all $g \in G_n$. (b) Find a bijection $h : G_n \to D_n$ such that
area(h(g)) = dinv(g) and bounce(h(g)) = area(g) for all $g \in G_n$ (see 6.106). Conclude that
the statistics dinv, bounce, area (on $G_n$), and area (on $D_n$) all have the same distribution.


Notes 


The idea used to prove 6.29 seems to have first appeared in Gupta [63]. The bijection in §6.9
is due to Foata [38]. For related material, see Foata and Schützenberger [40]. Much of the
early work on permutation statistics, including proofs of 6.44 and 6.48, is due to Major Percy
MacMahon [90]. The bijective proof of 6.48, along with other material on quantum Catalan
numbers, may be found in Fürlinger and Hofbauer [47]. The bounce statistic in 6.106 was
introduced by Haglund [64]; for more on this topic, see Haglund [65].


Formal Power Series


In the last chapter, we introduced techniques for computing generating functions
$G_S(x) = \sum_{z \in S} x^{\mathrm{wt}(z)}$, where S is a finite set of weighted objects. These generating functions are
polynomials in the variable x. Now suppose that S is an infinite set of weighted objects.
By analogy with the finite case, we would like to define a generating function
$G_S(x) = \sum_{z \in S} x^{\mathrm{wt}(z)} = \sum_{n=0}^{\infty} a_n x^n$, where $a_n$ is the number of objects in S of weight n. But the
resulting expression $G_S(x)$ is no longer a polynomial in x, since a polynomial can have only
finitely many terms.


For example, if S is the set of all words over the alphabet {0, 1} weighted by length, we
have $a_n = 2^n$ for all $n \ge 0$, and so


$$G_S(x) = 1 + 2x + 4x^2 + 8x^3 + \cdots + 2^n x^n + \cdots = \sum_{n=0}^{\infty} 2^n x^n.$$


If we think of x as a real number, then $G_S(x)$ is a function of a real variable that is defined
for all x sufficiently close to zero. In fact, using the geometric series formula from calculus,
one sees that $G_S(x) = \sum_{n \ge 0} (2x)^n = \frac{1}{1 - 2x}$ for $-1/2 < x < 1/2$. For values of x outside this
interval, $G_S(x)$ is undefined. More generally, the power series $\sum_{n=0}^{\infty} a_n x^n$ can be regarded
as a function of a real (or complex) variable x that is defined within a certain interval of
convergence centered at x = 0. However, difficulties can emerge if the coefficients $a_n$ grow
too rapidly. For instance, given $a_n = n!$ for all n, one can show using the ratio test that
the power series $H(x) = \sum_{n=0}^{\infty} n!\, x^n$ only converges at x = 0. Thus we cannot recover the
coefficients $a_n = n!$ from knowledge of the function H, which is only defined at x = 0.


As this example shows, using real-valued functions to model combinatorial generating
functions can be problematic because one must constantly worry about questions of
convergence. We would prefer a purely formal notion of a power series in which convergence issues
do not arise. The idea is to view a generating function $\sum_{n=0}^{\infty} a_n x^n$ as merely a shorthand
for an infinite sequence of integers $(a_0, a_1, a_2, \ldots, a_n, \ldots)$. The letter x is now only a symbol,
not a variable; we are not allowed to substitute specific real numbers for x.


This chapter gives a rigorous development of the algebraic properties of formal power 
series. Our goal is to extend the familiar operations on polynomial functions (like addition, 
multiplication, composition, and differentiation) to the setting of formal power series. In 
certain situations, we will even be able to define infinite sums and products of formal power 
series. These algebraic operations will be used to help develop the combinatorics of infinite 
weighted sets, which is the topic of the next chapter. 


In combinatorics, it usually suffices to consider formal power series whose coefficients 
are integers, rational numbers, or complex numbers. In this chapter, we will consider the 
slightly more general situation where the coefficients come from any field of characteristic 
zero (cf. 7.1 below). In fact, much of the algebraic theory is valid for power series with 
coefficients coming from an arbitrary ring (see 2.2). We shall indicate, as we proceed, which 
proofs require the stronger assumptions we are imposing on the coefficient ring.


7.1 The Ring of Formal Power Series


7.1. Notational Convention. Throughout this chapter, the letter K will stand for a field 
(see 2.3) that contains the field Q of rational numbers. 


For example, K might be Q itself, or R (the field of real numbers), or C (the field of 
complex numbers). K might also be a field Q(x) of formal rational functions, discussed 
in 7.46 below. 


7.2. Definition: Formal Power Series. A formal power series in one variable with
coefficients in K is a function $F : \mathbb{N} \to K$. We write F(n) or $F_n$ for the value of the function
F on the input $n \in \mathbb{N}$. The set of all such functions will be denoted K[[x]], where x is a
symbol called an indeterminate.


A formal power series $F \in K[[x]]$ is exactly the same as a sequence


$$F = (F_0, F_1, F_2, \ldots, F_n, \ldots) = (F(0), F(1), F(2), \ldots, F(n), \ldots)$$


indexed by nonnegative integers, where each $F_n \in K$. We often display this sequence using
power series notation, writing


$$F = \sum_{n=0}^{\infty} F_n x^n$$


and calling $F_n$ the coefficient of $x^n$ in F. For the time being, the symbol x appearing
in this notation has no independent meaning, and there are no addition, multiplication,
or exponentiation operations being performed on the right side. This notation is used to
help motivate the algebraic operations on power series to be introduced below, which are
suggested by corresponding operations on one-variable polynomials.


7.3. Remark: Equality of Formal Power Series. Two formal power series $F, G \in K[[x]]$
are equal iff $F_n = G_n$ for all $n \in \mathbb{N}$. This follows from the definition of equality of two
functions with domain $\mathbb{N}$.


7.4. Example. Consider the functions $G, H : \mathbb{N} \to \mathbb{Q}$ defined by $G(n) = 2^n$ and $H(n) = n!$
for all $n \in \mathbb{N}$. These are two elements of $\mathbb{Q}[[x]]$ which were discussed in the introduction to
this chapter. In sequence notation and power series notation, we would write


$$G = (1, 2, 4, 8, \ldots, 2^n, \ldots) = \sum_{n \ge 0} 2^n x^n;$$


$$H = (1, 1, 2, 6, 24, 120, 720, \ldots, n!, \ldots) = \sum_{n \ge 0} n!\, x^n.$$


The function $Z : \mathbb{N} \to K$ such that $Z(n) = 0_K$ for all $n \in \mathbb{N}$ defines the zero power series
$Z = \sum_{n \ge 0} 0 x^n$. We often denote Z (as well as the additive identity of K, the integer zero,
etc.) by the symbol 0.


7.5. Example: $X_i$. For each $i \in \mathbb{N}$, define a power series $X_i : \mathbb{N} \to K$ by setting $X_i(i) = 1$
and $X_i(j) = 0$ for all $j \ne i$. Thus $X_i$ is the sequence (0, 0, ..., 1, 0, ...) where the 1 is
preceded by i zeroes. We have


$$X_i = \sum_{n=0}^{\infty} \chi(n = i)\, x^n.$$


If we omit zero coefficients, it is tempting to write $X_0 = x^0 = 1$, $X_1 = x^1 = x$, and $X_i = x^i$.
Strictly speaking, these abbreviations of the official power series notation are not allowed,
but soon we will find a way to justify them.


We can now define addition and multiplication of formal power series. 


7.6. Definition: Sum and Product of Formal Power Series. Given $F, G \in K[[x]]$,
define the sum $F + G : \mathbb{N} \to K$ by (F + G)(n) = F(n) + G(n) for all $n \in \mathbb{N}$. Define the
product $FG : \mathbb{N} \to K$ by


$$(FG)(n) = \sum_{\substack{(i,j) \in \mathbb{N}^2: \\ i+j=n}} F(i) G(j) = \sum_{k=0}^{n} F(k) G(n-k).$$


FG is sometimes called the convolution of the functions F and G.


In sequence notation, this definition says


$$(F_n : n \ge 0) + (G_n : n \ge 0) = (F_n + G_n : n \ge 0);$$


$$(F_n : n \ge 0) \times (G_n : n \ge 0) = \Big( \sum_{k=0}^{n} F_k G_{n-k} : n \ge 0 \Big).$$


Using formal power series notation, these operations can also be written


$$\sum_{n \ge 0} F_n x^n + \sum_{n \ge 0} G_n x^n = \sum_{n \ge 0} (F_n + G_n) x^n;$$


$$\sum_{n \ge 0} F_n x^n \times \sum_{n \ge 0} G_n x^n = \sum_{n \ge 0} \Big( \sum_{i+j=n} F_i G_j \Big) x^n.$$


These formulas are exactly what we would expect (using the generalized distributive law)
if x and every $F_n$ and $G_n$ were elements in some ring, and the sums appearing were finite.
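
For experimentation, the sum and product of 7.6 are easy to model on a computer. The sketch
below is one possible encoding, not part of the formal development: a series is represented
by the list of its first N coefficients, and the names `series_add` and `series_mul` are our
own. Since the coefficient (FG)(n) depends only on F(0), ..., F(n) and G(0), ..., G(n),
truncating both inputs at the same order gives exact values for all coefficients kept.

```python
from fractions import Fraction

def series_add(F, G):
    """Termwise sum of two coefficient lists truncated at the same order."""
    return [f + g for f, g in zip(F, G)]

def series_mul(F, G):
    """Convolution: (FG)(n) = sum_{k=0}^{n} F(k) G(n-k) for each retained n."""
    N = min(len(F), len(G))
    return [sum(F[k] * G[n - k] for k in range(n + 1)) for n in range(N)]

N = 8
F = [Fraction(n + 1) for n in range(N)]       # (1, 2, 3, 4, ...)
G = [Fraction(1 - n % 2) for n in range(N)]   # (1, 0, 1, 0, ...)

print([int(c) for c in series_add(F, G)])  # [2, 2, 4, 4, 6, 6, 8, 8]
print([int(c) for c in series_mul(F, G)])  # [1, 2, 4, 6, 9, 12, 16, 20]
```

Coefficients are stored as `Fraction` values so that the arithmetic stays exact in a field
containing the rational numbers, in the spirit of convention 7.1.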


7.7. Example. In $\mathbb{Q}[[x]]$, we have

(1, 2, 3, 4, 5, 6, ...) + (1, 0, 1, 0, 1, 0, 1, 0, 1, 0, ...) = (2, 2, 4, 4, 6, 6, ...);
(1, 2, 3, 4, 5, 6, ...) × (1, 0, 1, 0, 1, 0, 1, 0, 1, 0, ...) = (1, 2, 4, 6, 9, 12, 16, 20, ...).

Given A = (3, 0, 2, 1, 7, 0, 0, 0, ...) and B = (1, 4, 5, 0, 0, ...) in K[[x]], we have

A + B = (4, 4, 7, 1, 7, 0, 0, ...);
AB = (3, 12, 17, 9, 21, 33, 35, 0, 0, ...).


Compare these formal operations to the ordinary sum and product of the two polynomial functions
$p(z) = 3 + 2z^2 + z^3 + 7z^4$ and $q(z) = 1 + 4z + 5z^2$:


$$p(z) + q(z) = 4 + 4z + 7z^2 + z^3 + 7z^4 \quad (z \in \mathbb{R});$$


$$p(z) q(z) = 3 + 12z + 17z^2 + 9z^3 + 21z^4 + 33z^5 + 35z^6 \quad (z \in \mathbb{R}).$$


Now suppose $F = (F_n : n \in \mathbb{N}) \in K[[x]]$ and $C = (1, 1, 1, \ldots) \in K[[x]]$. Then


$$FC = CF = (F_0,\ F_0 + F_1,\ F_0 + F_1 + F_2,\ \ldots,\ F_0 + F_1 + \cdots + F_n,\ \ldots).$$


Thus, multiplication by C replaces a sequence of scalars by the sequence of partial sums of
those scalars.
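
The computations in this example can be replayed with a few lines of Python. This is only an
illustrative sketch on truncated coefficient lists; the function name `series_mul` and the
test sequence `F` are arbitrary choices of ours.

```python
from itertools import accumulate

def series_mul(F, G):
    """Convolution product of two coefficient lists truncated at the same order."""
    N = min(len(F), len(G))
    return [sum(F[k] * G[n - k] for k in range(n + 1)) for n in range(N)]

A = [3, 0, 2, 1, 7, 0, 0, 0]
B = [1, 4, 5, 0, 0, 0, 0, 0]
print(series_mul(A, B))  # [3, 12, 17, 9, 21, 33, 35, 0]

# Multiplying by C = (1, 1, 1, ...) produces the sequence of partial sums.
F = [5, -2, 7, 1, 0, 4, 4, -3]
C = [1] * len(F)
print(series_mul(F, C))      # [5, 3, 10, 11, 11, 15, 19, 16]
print(list(accumulate(F)))   # [5, 3, 10, 11, 11, 15, 19, 16]
```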


7.8. Theorem: Algebraic Structure of K[[x]]. (a) With the sum and product operations
defined above, K[[x]] is a commutative ring. (b) K[[x]] contains the field K, provided we
identify each $a \in K$ with the sequence $(a, 0, 0, 0, \ldots) \in K[[x]]$. (c) K[[x]] is a vector space
over K.


Proof. (a) We verify some of the ring axioms for K[[x]] (see 2.2), leaving the others as
exercises. If F and G are functions from $\mathbb{N}$ to K, then F + G and FG are also functions that map
$\mathbb{N}$ into K (using closure of K under addition and multiplication). In other words, $F \in K[[x]]$
and $G \in K[[x]]$ imply $F + G \in K[[x]]$ and $FG \in K[[x]]$, so the closure axioms for K[[x]] are
true. To see that addition in K[[x]] is associative, fix $F, G, H \in K[[x]]$. Using associativity
of addition in the field K, we see that


$$[(F + G) + H]_n = (F + G)_n + H_n = (F_n + G_n) + H_n
= F_n + (G_n + H_n) = F_n + (G + H)_n = [F + (G + H)]_n$$


for every $n \in \mathbb{N}$. Thus (by 7.3) (F + G) + H = F + (G + H).
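The verification that (FG)H = F(GH) is somewhat more elaborate. On one hand, for
a fixed $n \in \mathbb{N}$,


$$[(FG)H]_n = \sum_{\substack{(i,c) \in \mathbb{N}^2: \\ i+c=n}} (FG)_i H_c
= \sum_{\substack{(i,c) \in \mathbb{N}^2: \\ i+c=n}} \Big( \sum_{\substack{(a,b) \in \mathbb{N}^2: \\ a+b=i}} F_a G_b \Big) H_c
= \sum_{\substack{(a,b,c) \in \mathbb{N}^3: \\ (a+b)+c=n}} (F_a G_b) H_c.$$


The last step used the distributive law in K and a reindexing of the summations (which is
permissible since addition in K is commutative). On the other hand,


$$[F(GH)]_n = \sum_{\substack{(a,k) \in \mathbb{N}^2: \\ a+k=n}} F_a (GH)_k
= \sum_{\substack{(a,k) \in \mathbb{N}^2: \\ a+k=n}} F_a \Big( \sum_{\substack{(b,c) \in \mathbb{N}^2: \\ b+c=k}} G_b H_c \Big)
= \sum_{\substack{(a,b,c) \in \mathbb{N}^3: \\ a+(b+c)=n}} F_a (G_b H_c).$$


Using associativity of addition in $\mathbb{N}$ and associativity of multiplication in K, we see that
$[(FG)H]_n = [F(GH)]_n$ for all $n \in \mathbb{N}$, hence (FG)H = F(GH) as desired.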

Next we claim that $X_0 = (1, 0, 0, \ldots) = \sum_{n \ge 0} \chi(n = 0)\, x^n$ is the multiplicative identity
element in K[[x]]. For, given $F \in K[[x]]$ and $n \in \mathbb{N}$, we compute


$$(X_0 F)_n = \sum_{i+j=n} X_0(i) F(j) = 1 F(n) + 0 F(n-1) + \cdots + 0 F(0) = F_n.$$


Thus $X_0 F = F$, and similarly $F X_0 = F$. We let the reader verify the remaining ring
axioms, namely: the zero sequence is the additive identity in K[[x]]; the additive inverse of
$(F_n : n \ge 0)$ is $(-F_n : n \ge 0)$; addition in K[[x]] is commutative; multiplication in K[[x]] is
commutative (this uses commutativity of K); and the distributive law holds.

For (b), observe first that the map $a \mapsto (a, 0, 0, \ldots)$ is a bijection of K onto the subset of
K[[x]] consisting of sequences that are zero after position zero. The definitions immediately
show that


$$(a, 0, 0, \ldots) + (b, 0, 0, \ldots) = (a + b, 0, 0, \ldots);$$

$$-(a, 0, 0, \ldots) = (-a, 0, 0, \ldots);$$

$$(a, 0, 0, \ldots) \times (b, 0, 0, \ldots) = (ab, 0, 0, \ldots);$$


and furthermore, $0_K \mapsto (0, 0, 0, \ldots) = 0_{K[[x]]}$ and $1_K \mapsto (1, 0, 0, \ldots) = 1_{K[[x]]}$. This shows
that operations in K[[x]] on sequences of this form agree with the corresponding field
operations in K. So we can view K as embedded in K[[x]] by means of this bijection. (More
formally, we have found an “isomorphic copy” of K inside K[[x]].)

(c) Define scalar multiplication in K[[x]] by setting $cF = (cF_n : n \in \mathbb{N})$ for $c \in K$
and $F \in K[[x]]$. One checks that cF is the same as the product of $(c, 0, 0, \ldots)$ and
$F = (F_0, F_1, F_2, \ldots)$ in the ring K[[x]]. Using this observation, one sees immediately that K[[x]]
satisfies all the axioms for a vector space over K, because the required identities are special
cases of the ring axioms that have just been verified. □


7.2 Finite Products and Powers of Formal Series


Now that we know K[[x]] is a ring, we can iterate the binary operations of addition and
multiplication to define finite sums and products of formal power series. Similarly, for any
integer $n \ge 0$ and any $G \in K[[x]]$, the power $G^n$ is defined recursively by setting $G^0 = 1$
and, for $n \ge 0$, setting $G^{n+1} = G^n \cdot G$. Intuitively, $G^n$ is the product of n factors all equal
to G. Later, we will see that infinite sums and infinite products of formal power series can
be defined in certain situations. We will also obtain a criterion for when the multiplicative
inverse $G^{-1}$ (and other negative powers of G) can be formed.


7.9. Example: Powers of x. For $i \in \mathbb{N}$, define $X_i = \sum_{n \ge 0} \chi(n = i)\, x^n$ as in 7.5. We claim
that $X_1^i = X_i$ for all $i \ge 0$. The claim holds when i = 0 since $X_1^0 = 1_{K[[x]]}$ by definition,
and we saw in 7.8 that $1_{K[[x]]} = X_0$. Fix $i \ge 0$ and assume by induction that $X_1^i = X_i$. Now


$$X_1^{i+1}(n) = (X_1^i X_1)(n) = (X_i X_1)(n) = \sum_{a+b=n} X_i(a) X_1(b) \qquad (n \in \mathbb{N}).$$


The only choice of (a, b) that produces a nonzero summand is a = i and b = 1, which can
occur only for n = i + 1. So $X_1^{i+1}(n) = 0 = X_{i+1}(n)$ if $n \ne i + 1$, and
$X_1^{i+1}(i+1) = 1 = X_{i+1}(i+1)$. Thus $X_1^{i+1} = X_{i+1}$, verifying the claim for i + 1.
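
The claim $X_1^i = X_i$ can also be observed experimentally on truncated coefficient lists.
The following sketch is an illustration only; the helpers `X`, `series_mul`, and
`series_pow` are hypothetical names, and the truncation order N = 10 is arbitrary.

```python
def X(i, N):
    """First N coefficients of the series X_i: a single 1 in position i."""
    return [1 if n == i else 0 for n in range(N)]

def series_mul(F, G):
    """Convolution product of two coefficient lists truncated at the same order."""
    N = min(len(F), len(G))
    return [sum(F[k] * G[n - k] for k in range(n + 1)) for n in range(N)]

def series_pow(G, m):
    """G^m via the recursion G^0 = 1 and G^(n+1) = G^n * G."""
    result = X(0, len(G))   # the multiplicative identity (1, 0, 0, ...)
    for _ in range(m):
        result = series_mul(result, G)
    return result

N = 10
print(series_pow(X(1, N), 3))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(all(series_pow(X(1, N), i) == X(i, N) for i in range(N)))  # True
```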

If we define x to be the particular formal power series $X_1 \in K[[x]]$, the claim shows
that $x^i = X_i$ for all $i \ge 0$. We have now justified our earlier “notational abuse” in 7.5.
Furthermore, for any finite sequence of scalars $c_0, c_1, \ldots, c_N \in K$, define $C \in K[[x]]$ by
letting $C(n) = c_n$ for $0 \le n \le N$ and C(n) = 0 for n > N. Then the definition of addition
and scalar multiplication shows that


$$c_0 + c_1 x + c_2 x^2 + \cdots + c_N x^N = (c_0, c_1, \ldots, c_N, 0, 0, \ldots) = C = \sum_{n \ge 0} C_n x^n,$$


where the leftmost expression is built up from $x = X_1$ and the $c_i$'s by algebraic operations in
K[[x]], and the rightmost expression is our atomic notation for the series C. Later, after we
give a meaning to infinite summations of formal power series, we will see that the analogous
identity


$$C_0 + C_1 x + C_2 x^2 + \cdots + C_n x^n + \cdots = C = \sum_{n \ge 0} C_n x^n$$


is also valid for any $C \in K[[x]]$.


7.10. Theorem: Products of k Series. Suppose $G_1, G_2, \ldots, G_k \in K[[x]]$. For all $n \in \mathbb{N}$,


$$(G_1 G_2 \cdots G_k)(n) = \sum_{\substack{(j_1, j_2, \ldots, j_k) \in \mathbb{N}^k: \\ j_1 + j_2 + \cdots + j_k = n}} G_1(j_1) G_2(j_2) \cdots G_k(j_k).$$


Proof. We use induction on k. The case k = 1 is immediate, and the case k = 2 holds by
the definition of the product of two formal power series. Assume k > 2 and the result is
already known for products of k − 1 series. Letting $F = G_1 G_2 \cdots G_{k-1}$, we calculate


$$\begin{aligned}
(G_1 G_2 \cdots G_k)(n) = (F G_k)(n)
&= \sum_{\substack{(r,s) \in \mathbb{N}^2: \\ r+s=n}} F(r) G_k(s) \\
&= \sum_{\substack{(r,s) \in \mathbb{N}^2: \\ r+s=n}} \Big( \sum_{\substack{(j_1, \ldots, j_{k-1}) \in \mathbb{N}^{k-1}: \\ j_1 + \cdots + j_{k-1} = r}} G_1(j_1) G_2(j_2) \cdots G_{k-1}(j_{k-1}) \Big) G_k(s) \\
&= \sum_{\substack{(j_1, \ldots, j_k) \in \mathbb{N}^k: \\ j_1 + \cdots + j_k = n}} G_1(j_1) G_2(j_2) \cdots G_{k-1}(j_{k-1}) G_k(j_k).
\end{aligned}$$


The last step follows by the generalized distributive law and a change in the names of the
summation indices. □


Now we can prove a result, similar to the multinomial theorem 2.12, that lets us compute 
a power of a formal series. 


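7.11. Theorem: Powers of Formal Series. For all $G \in K[[x]]$ and all $m, n \in \mathbb{N}$,


$$G^m(n) = \sum_{\substack{(k_0, k_1, \ldots, k_n) \in \mathbb{N}^{n+1}: \\ \sum_i k_i = m,\ \sum_i i k_i = n}} \binom{m}{k_0, k_1, \ldots, k_n} G(0)^{k_0} G(1)^{k_1} \cdots G(n)^{k_n}.$$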


Proof. Applying 7.10 with k = m and all $G_i$'s equal to G, we see that


$$G^m(n) = \sum_{\substack{(j_1, \ldots, j_m) \in \mathbb{N}^m: \\ j_1 + \cdots + j_m = n}} G(j_1) G(j_2) \cdots G(j_m).$$


Given $(k_0, k_1, \ldots, k_n) \in \mathbb{N}^{n+1}$ satisfying $\sum_i k_i = m$ and $\sum_i i k_i = n$, let us group together
all the summands indexed by sequences $(j_1, \ldots, j_m) \in R(0^{k_0} 1^{k_1} \cdots n^{k_n})$. The number of
such summands is the multinomial coefficient $\binom{m}{k_0, k_1, \ldots, k_n}$ by 1.46, and (by commutativity
of multiplication in K) every such summand is equal to $G(0)^{k_0} G(1)^{k_1} \cdots G(n)^{k_n}$. Summing
over all possible choices of $(k_0, \ldots, k_n)$ gives the stated formula for $G^m(n)$. (Compare to
the proof of 2.12.)
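
The formula in 7.11 can be tested numerically against repeated convolution. The sketch below
is a brute-force illustration with hypothetical function names; it enumerates all tuples
$(k_0, \ldots, k_n)$ naively, so it is only practical for small m and n.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def series_mul(F, G):
    """Convolution product of two coefficient lists truncated at the same order."""
    N = min(len(F), len(G))
    return [sum(F[k] * G[n - k] for k in range(n + 1)) for n in range(N)]

def power_coeff_by_convolution(G, m, n):
    """Coefficient G^m(n), computed by multiplying m copies of G."""
    P = [1] + [0] * n
    for _ in range(m):
        P = series_mul(P, G[: n + 1])
    return P[n]

def power_coeff_by_formula(G, m, n):
    """Coefficient G^m(n), computed from the multinomial formula of 7.11."""
    total = 0
    for ks in product(range(m + 1), repeat=n + 1):
        if sum(ks) == m and sum(i * k for i, k in enumerate(ks)) == n:
            multinomial = factorial(m)
            for k in ks:
                multinomial //= factorial(k)   # each partial quotient is an integer
            total += multinomial * prod(G[i] ** k for i, k in enumerate(ks))
    return total

G = [Fraction(1, 2), 3, -1, 2, 5]   # arbitrary test coefficients G(0), ..., G(4)
print(all(
    power_coeff_by_convolution(G, m, n) == power_coeff_by_formula(G, m, n)
    for m in range(4) for n in range(5)
))  # True
```

For G = 1 + x the formula reduces to the binomial theorem; for instance, the coefficient of
$x^2$ in $(1 + x)^3$ is 3.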


7.3 Formal Polynomials 


Polynomials, like the power series studied in calculus, are often regarded as functions of a
real variable x. For a function $p : \mathbb{R} \to \mathbb{R}$ to be a polynomial, there must exist constants
$a_i \in \mathbb{R}$ and $n \in \mathbb{N}$ such that $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ for all $x \in \mathbb{R}$. One can prove
that the coefficients $a_i$ are uniquely determined by p. This functional view of polynomials is
not really necessary for many algebraic and combinatorial purposes. We now give a rigorous
discussion of “formal” polynomials and their algebraic properties. Some of these properties
will follow quickly from results already proved for formal power series.


7.12. Definition: Formal Polynomials. A formal power series $F \in K[[x]]$ is a polynomial
iff $\{n \in \mathbb{N} : F(n) \ne 0\}$ is a finite set. Let K[x] be the set of all polynomials in K[[x]].


Intuitively, a polynomial is a formal power series with only finitely many nonzero coef- 
ficients. 


7.13. Definition: Degree of a Polynomial. Given a nonzero polynomial $f \in K[x]$, the
degree of f, denoted deg(f), is the largest $n \in \mathbb{N}$ with $f(n) \ne 0$. For this n, the element $f(n) \in K$ is
the leading coefficient of f, and f is called monic iff f(n) = 1. The degree of the
zero polynomial is undefined.


7.14. Theorem: Properties of Degree. For all $f, g \in K[x]$, (a) f + g is a polynomial, and
$\deg(f + g) \le \max(\deg(f), \deg(g))$ whenever both sides are defined. (b) fg is a polynomial.
If f and g are nonzero, then $fg \ne 0$, and deg(fg) = deg(f) + deg(g).


Proof. (a) Certainly f + g is a polynomial if f = 0 or g = 0. Otherwise, let n = deg(f),
m = deg(g), and k = max(m, n). For all i > k, (f + g)(i) = f(i) + g(i) = 0 + 0 = 0. On one
hand, this shows that $\{i \in \mathbb{N} : (f + g)(i) \ne 0\} \subseteq \{0, 1, \ldots, k\}$, so that f + g is a polynomial.
On the other hand, this also shows that $\deg(f + g) \le k$ if $f + g \ne 0$.

(b) Certainly fg is a polynomial if f = 0 or g = 0. Now assume f is nonzero of degree
n, and g is nonzero of degree m. Thus f(i) = 0 for all i > n and g(j) = 0 for all j > m.
Suppose k > n + m and $(i, j) \in \mathbb{N}^2$ satisfy i + j = k. Then we must have either i > n
or j > m. Thus, every summand in the expression $(fg)(k) = \sum_{i+j=k} f(i) g(j)$ is zero,
so (fg)(k) = 0 for all k > n + m. This shows that fg is a polynomial. We also have
$(fg)(n + m) = \sum_{i+j=n+m} f(i) g(j) = f(n) g(m)$ since the only nonzero summand occurs
when i = n and j = m. Now $f(n) g(m) \ne 0$ since $f(n) \ne 0$ and $g(m) \ne 0$ and K is a field
(cf. 7.27 below). Thus, $(fg)(n + m) \ne 0$, whereas all higher coefficients of fg are zero. We
now see that $fg \ne 0$ and deg(fg) = n + m = deg(f) + deg(g). □


7.15. Theorem: Algebraic Structure of K[x]. (a) K[x] is a commutative ring containing
the field K. (b) K[x] is a vector space over K with basis $\{X_i : i \ge 0\} = \{x^i : i \ge 0\}$ (see 7.9).


Proof. (a) We have just seen that K[x] is closed under addition and multiplication. All
the other ring axioms in 2.2 follow automatically from the corresponding ring axioms for
K[[x]], once we notice that −f is a polynomial whenever f is, and $0_{K[[x]]}$ and $1_{K[[x]]}$ are
polynomials. More generally, for any $c \in K$, every power series of the form (c, 0, 0, ...) is a
polynomial. So K[x] contains the field K (or, more precisely, the copy of K inside K[[x]]).

(b) Given a nonzero polynomial $f \in K[x]$, let n = deg(f). We see that


$$f(0) X_0 + f(1) X_1 + \cdots + f(n) X_n = f,$$


since both series take the same value at every $k \in \mathbb{N}$ (cf. 7.9). Thus $\{X_i : i \in \mathbb{N}\}$ is a
spanning set for the vector space K[x]. To see that this set is linearly independent, consider
a finite linear combination


$$c_{i_1} X_{i_1} + c_{i_2} X_{i_2} + \cdots + c_{i_k} X_{i_k} = 0,$$


where the $i_j$'s are distinct indices and each $c_{i_j} \in K$. Evaluating the left side at $i_j$, we see
that $c_{i_j} = 0$ for all j. Thus, $\{X_i : i \ge 0\}$ is a basis of K[x]. □


Note that $B = \{x^i : i \ge 0\}$ is a basis for $K[x]$ but not a basis for $K[[x]]$. The set $B$
does not span $K[[x]]$, because we are not allowed to form “infinite linear combinations”
$\sum_{k \ge 0} c_k x^k$ when determining the span (in the linear-algebraic sense) of the set $B$. It is true
that $K[[x]]$ has some basis (as does every vector space), but this basis will be much larger
than the collection $B$ and cannot be specified explicitly.

7.16. Example. The sequences $f = (2,0,1,3,0,0,\ldots)$ and $g = (1,-1,0,-3,0,0,\ldots)$ are
polynomials of degree 3, which can be written in terms of the basis $B$ as $f = 2 + x^2 + 3x^3$
and $g = 1 - x - 3x^3$. We calculate

$$f+g = 3 - x + x^2, \qquad fg = 2 - 2x + x^2 - 4x^3 - 3x^4 - 3x^5 - 9x^6.$$

We have $\deg(f) = 3 = \deg(g)$, $\deg(fg) = 6 = \deg(f) + \deg(g)$, and $\deg(f+g) = 2 <
\max(\deg(f),\deg(g))$. We see that strict inequality occurs in the last formula, since the
leading coefficients of $f$ and $g$ cancel in $f+g$.
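
The arithmetic in this example can be reproduced mechanically. The following is a minimal sketch, assuming a polynomial is stored as a Python list of coefficients $(f_0, f_1, \ldots)$ with trailing zeros trimmed; the helper names `trim`, `poly_add`, `poly_mul`, and `deg` are illustrative choices, not notation from the text.

```python
def trim(coeffs):
    """Drop trailing zeros so the list length reflects the degree."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def poly_add(f, g):
    """Coefficientwise sum: (f + g)(i) = f(i) + g(i)."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim(a + b for a, b in zip(f, g))


def poly_mul(f, g):
    """Convolution product: (fg)(k) = sum of f(i) g(j) over i + j = k."""
    f, g = trim(f), trim(g)
    if not f or not g:
        return []  # the zero polynomial
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def deg(f):
    """Degree of a nonzero polynomial; undefined for the zero polynomial."""
    f = trim(f)
    if not f:
        raise ValueError("degree of the zero polynomial is undefined")
    return len(f) - 1


f = [2, 0, 1, 3]      # 2 + x^2 + 3x^3
g = [1, -1, 0, -3]    # 1 - x - 3x^3
print(poly_add(f, g))  # [3, -1, 1]
print(poly_mul(f, g))  # [2, -2, 1, -4, -3, -3, -9]
```

Here `deg(poly_mul(f, g))` equals `deg(f) + deg(g)`, while `deg(poly_add(f, g))` drops to 2 because the leading coefficients cancel.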


We stress once more that formal polynomials are not “functions of $x$.” Two formal
polynomials $f = \sum_{n \ge 0} f_n x^n$ and $g = \sum_{n \ge 0} g_n x^n$ are equal iff $f_n = g_n$ for all $n \in \mathbb{N}$. This
is true by the definition of formal polynomials as sequences (functions with domain $\mathbb{N}$) and
is equivalent to the linear independence of the basis $B$. Nevertheless, we can use a formal
polynomial to define an associated polynomial function, as follows.


7.17. Definition: Polynomial Functions. Given a nonzero polynomial $f \in K[x]$ and a
commutative ring $R$ containing the field $K$, the polynomial function associated to $f$ with
domain $R$ is the function $P_f : R \to R$ defined by

$$P_f(z) = \sum_{n=0}^{\deg(f)} f_n z^n \qquad (z \in R).$$

If $f = 0$, we let $P_f(z) = 0$ for all $z \in R$.


One can show that, because $R$ contains the infinite field $\mathbb{Q}$, $f = g$ (equality of formal
polynomials) iff $P_f = P_g$ (equality of functions). However, this statement fails if one
considers polynomial functions defined on finite rings and fields (see the exercises).


7.18. Example. Let $f = 2 + x^2 + 3x^3 \in \mathbb{Q}[x]$, and let $R = \mathbb{C}$ (the complex numbers). Then
$P_f(\sqrt{2}) = 2 + (\sqrt{2})^2 + 3(\sqrt{2})^3 = 4 + 6\sqrt{2}$ and $P_f(i) = 2 + i^2 + 3i^3 = 1 - 3i$.


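7.19. Example. Let $h = 1 - x + x^2 \in \mathbb{Q}[x]$, and let $R = \mathbb{Q}[x]$. Then $P_h(2x^3) = 1 - (2x^3) +
(2x^3)^2 = 1 - 2x^3 + 4x^6$ and $P_h(h) = 1 - (1 - x + x^2) + (1 - x + x^2)^2 = 1 - x + 2x^2 - 2x^3 + x^4$.
Note also that $P_h(x) = 1 - x + x^2 = h$; more generally, $P_f(x) = f$ for any $f \in K[x]$. Next
suppose $R = \mathbb{Q}[[x]]$ and $z = \sum_{n \ge 0} x^n \in R$. Then

$$P_h(z) = 1 - z + z^2 = 1 + x + 2x^2 + 3x^3 + 4x^4 + 5x^5 + \cdots = (1,1,2,3,4,5,\ldots),$$

since the coefficient of $x^n$ in $z^2$ is $n+1$.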


Intuitively, the next result confirms that the algebraic operations on formal polynomials 
agree with the familiar algebraic operations on polynomial functions. 


7.20. Theorem: Comparison of Algebraic Operations on Formal Polynomials and
Polynomial Functions. Let $f,g \in K[x]$, let $c \in K \subseteq K[x]$, let $R$ be a commutative ring
containing the field $K$, and let $z \in R$. (a) $P_{f+g}(z) = P_f(z) + P_g(z)$. (b) $P_{fg}(z) = P_f(z)P_g(z)$.
(c) $P_c(z) = c$.


Proof. We prove (b), leaving (a) and (c) as exercises. Both sides in (b) are zero if $f = 0$ or
$g = 0$, so assume $f \ne 0$ and $g \ne 0$. Write $n = \deg(f)$ and $m = \deg(g)$, so $\deg(fg) = n+m$.
Now, compute

$$\begin{aligned}
P_{fg}(z) &= \sum_{k=0}^{n+m} (fg)_k z^k && \text{(definition of $P_{fg}$)} \\
&= \sum_{k=0}^{n+m} \Bigl(\sum_{i+j=k} f_i g_j\Bigr) z^k && \text{(definition of $fg$)} \\
&= \sum_{k=0}^{n+m} \sum_{i+j=k} (f_i z^i)(g_j z^j) && \text{(distributive law in $R$ and commutativity of $R$)} \\
&= \Bigl(\sum_{i=0}^{n} f_i z^i\Bigr)\Bigl(\sum_{j=0}^{m} g_j z^j\Bigr) && \text{(generalized distributive law in $R$)} \\
&= P_f(z)P_g(z) && \text{(definition of $P_f$ and $P_g$).} \qquad \square
\end{aligned}$$
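
Theorem 7.20(b) can be spot-checked by evaluating both sides at a few sample points. The sketch below assumes coefficient lists for polynomials and uses Horner's rule for $P_f(z)$; the names `evaluate` and `poly_mul` are hypothetical helpers, and exact rationals from Python's `fractions` module stand in for elements of $R$.

```python
from fractions import Fraction


def evaluate(f, z):
    """Horner evaluation of P_f(z); f lists the coefficients f_0, f_1, ..."""
    acc = 0
    for c in reversed(f):
        acc = acc * z + c
    return acc


def poly_mul(f, g):
    """Product of two nonzero polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


f = [2, 0, 1, 3]      # 2 + x^2 + 3x^3, the polynomial of Example 7.18
g = [1, -1, 0, -3]    # 1 - x - 3x^3

# Check P_{fg}(z) = P_f(z) P_g(z) at several exact sample points.
for z in (Fraction(1, 2), Fraction(-7, 3), 5):
    assert evaluate(poly_mul(f, g), z) == evaluate(f, z) * evaluate(g, z)

print(evaluate(f, 1j))  # (1-3j), matching P_f(i) = 1 - 3i in Example 7.18
```

A finite number of sample points proves nothing by itself, of course; the check only illustrates the identity that the theorem establishes for every $z \in R$.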


We can rephrase this result in a somewhat more sophisticated way, using the following 
definition. 


7.21. Definition: Ring Homomorphisms. Let $R$ and $S$ be rings. A map $f : R \to S$
is a ring homomorphism iff $f(1_R) = 1_S$ and for all $x,y \in R$, $f(x+y) = f(x) + f(y)$ and
$f(xy) = f(x)f(y)$.


7.22. Theorem: Evaluation Homomorphisms on $K[x]$. Suppose $R$ is a commutative
ring containing the field $K$. For each $z \in R$, there exists a unique ring homomorphism
$\mathrm{ev}_z : K[x] \to R$ such that $\mathrm{ev}_z(x) = z$ and $\mathrm{ev}_z(c) = c$ for all $c \in K$. Furthermore, $\mathrm{ev}_z(f) =
P_f(z)$ for all $f \in K[x]$. We call $\mathrm{ev}_z$ the evaluation homomorphism on $K[x]$ determined by
evaluating $x$ at $z$.


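Proof. The particular map $\mathrm{ev}_z(f) = P_f(z)$ is a ring homomorphism sending $x$ to $z$ and
fixing $K$, since (by the previous theorem) for all $f,g \in K[x]$ and $c \in K$,

$$\begin{aligned}
\mathrm{ev}_z(f+g) &= P_{f+g}(z) = P_f(z) + P_g(z) = \mathrm{ev}_z(f) + \mathrm{ev}_z(g); \\
\mathrm{ev}_z(fg) &= P_{fg}(z) = P_f(z)P_g(z) = \mathrm{ev}_z(f)\,\mathrm{ev}_z(g); \\
\mathrm{ev}_z(c) &= P_c(z) = c \quad (\text{so } \mathrm{ev}_z(1_{K[x]}) = 1_R);
\end{aligned}$$

and $\mathrm{ev}_z(x) = P_x(z) = z$. To prove uniqueness, let $E : K[x] \to R$ be any ring homomorphism
with $E(x) = z$ and $E(c) = c$ for $c \in K$. For a nonzero $f \in K[x]$ of degree $n$, we have

$$E(f) = E\Bigl(\sum_{k=0}^{n} f_k x^k\Bigr) = \sum_{k=0}^{n} E(f_k x^k) = \sum_{k=0}^{n} E(f_k)E(x)^k = \sum_{k=0}^{n} f_k z^k = P_f(z) = \mathrm{ev}_z(f).$$

As $E(0) = 0_R = \mathrm{ev}_z(0)$, we conclude that $E = \mathrm{ev}_z$. □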




7.4 Order of Formal Power Series 


We now discuss the order of a formal power series, which is analogous to the degree of a
formal polynomial. The degree of a polynomial $F$ is the largest $n$ with $F(n) \ne 0$. Such an $n$
will not exist if $F$ is a formal power series that is not a polynomial. So we instead proceed
as follows.

7.23. Definition: Order of a Formal Power Series. Let $F = \sum_{n \ge 0} F_n x^n \in K[[x]]$ be
a nonzero series. The order of $F$, denoted $\mathrm{ord}(F)$, is the least $n \ge 0$ such that $F(n) \ne 0$.
Since $\{k \in \mathbb{N} : F(k) \ne 0\}$ is a nonempty subset of $\mathbb{N}$, the order of $F$ is well defined. The
order of the zero series is undefined.


7.24. Example. The order of $(0,0,0,4,2,1,4,2,1,4,2,1,\ldots)$ is 3. Nonzero polynomials
have both a degree and an order; for instance, $x^2 + x^5 + 3x^7$ has degree 7 and order 2.


The properties of order are analogous to the properties of degree. 


7.25. Theorem: Properties of Order. Let $F$ and $G$ be nonzero formal series in $K[[x]]$.
(a) If $F + G \ne 0$, then $\mathrm{ord}(F+G) \ge \min(\mathrm{ord}(F),\mathrm{ord}(G))$. (b) $FG \ne 0$, and $\mathrm{ord}(FG) =
\mathrm{ord}(F) + \mathrm{ord}(G)$.


Proof. Let $n = \mathrm{ord}(F)$ and $m = \mathrm{ord}(G)$. (a) Let $k = \min(n,m)$. For any $i < k$, $(F+G)_i =
F_i + G_i = 0 + 0 = 0$, so $\mathrm{ord}(F+G) \ge k$.

(b) For any $p < n+m$, $(FG)_p = \sum_{i+j=p} F_i G_j$. For any pair $(i,j) \in \mathbb{N}^2$ with sum $p$,
either $i < n$ or $j < m$. Thus $F_i = 0$ or $G_j = 0$, so every summand $F_i G_j = 0$. Hence,
$(FG)_p = 0$. On the other hand, for $p = n+m$, we only get a nonzero summand when $i = n$
and $j = m$. So $(FG)_{n+m} = F_n G_m \ne 0$, since $F_n \ne 0$ and $G_m \ne 0$ and $K$ is a field (cf. 7.27
below). This shows that $FG \ne 0$ and that $n+m$ is the least element $p$ of $\mathbb{N}$ with $(FG)_p \ne 0$,
hence $\mathrm{ord}(FG) = n+m = \mathrm{ord}(F) + \mathrm{ord}(G)$. □
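
The behavior of order under sums and products can be illustrated on truncated series. This is only a sketch: a power series is represented, by assumption, as the list of its first few coefficients, so every statement is valid only below the truncation length. The names `order` and `series_mul` and the second series `G` are hypothetical.

```python
def order(F):
    """Index of the first nonzero stored coefficient."""
    for n, c in enumerate(F):
        if c != 0:
            return n
    raise ValueError("no nonzero coefficient among the stored terms")


def series_mul(F, G):
    """Cauchy product, truncated to the shorter of the two inputs."""
    N = min(len(F), len(G))
    return [sum(F[i] * G[n - i] for i in range(n + 1)) for n in range(N)]


# First ten coefficients of the series in Example 7.24 (order 3).
F = [0, 0, 0, 4, 2, 1, 4, 2, 1, 4]
# A made-up series 5x^2 + x^4 (order 2), padded to the same length.
G = [0, 0, 5, 0, 1, 0, 0, 0, 0, 0]

print(order(F), order(G))           # 3 2
print(order(series_mul(F, G)))      # 5
print(order([a + b for a, b in zip(F, G)]))  # 2
```

The product has order $3 + 2 = 5$ with lowest coefficient $F_3 G_2 = 20$, and the sum has order $\min(3,2) = 2$, consistent with parts (b) and (a) of the theorem.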


7.26. Definition: Integral Domains. A commutative ring $R$ with more than one element
is an integral domain iff for all nonzero $x,y \in R$, $xy \ne 0$. Equivalently, for all $x,y \in R$,
$xy = 0$ implies $x = 0$ or $y = 0$.


7.27. Example. Every field $K$ (see 2.3) is an integral domain. For if $x,y \in K$, $xy = 0$, and
$x \ne 0$, then $x$ has a multiplicative inverse in $K$. Multiplying $xy = 0$ by the inverse of $x$, we
see that $y = 0$. The ring $\mathbb{Z}$ is an integral domain that is not a field. Part (b) of 7.14 shows
that $K[x]$ is an integral domain. Part (b) of 7.25 shows that $K[[x]]$ is an integral domain. A
key step in both proofs was the deduction that $F_n G_m \ne 0$ since $F_n \ne 0$ and $G_m \ne 0$. Thus,
these proofs tacitly used the fact that the field $K$ is an integral domain. This hypothesis
on $K$ (which is weaker than assuming that $K$ is a field) is enough to ensure that $K[x]$ and
$K[[x]]$ will be integral domains.




7.5 Formal Limits, Infinite Sums, and Infinite Products 


Even in the algebraic setting of formal power series, one can imitate the limiting operations 
that play such a prominent role in calculus. In particular, we can use formal limits of formal 
power series to define infinite sums and infinite products of formal power series in certain 
situations. 


7.28. Definition: Limit of a Sequence of Formal Power Series. Suppose $(F_k : k \in \mathbb{N})$
is a sequence of elements of $K[[x]]$ (so $F_k : \mathbb{N} \to K$ for each $k \in \mathbb{N}$), and $G \in K[[x]]$. Write

$$\lim_{k \to \infty} F_k = G \qquad (\text{or } F_k \to G)$$

iff for each $n \ge 0$, there exists an index $K(n)$ such that $k \ge K(n)$ implies $F_k(n) = G(n)$.

Informally, the sequence $(F_k : k \in \mathbb{N})$ of formal power series converges to some (necessarily
unique) limit series in $K[[x]]$ iff for each $n \in \mathbb{N}$, the coefficient of $x^n$ in $F_k$ eventually
becomes constant for large enough values of $k$ (and this constant is the coefficient of $x^n$ in
the limit series $G$).


7.29. Example. We have $\lim_{k \to \infty} x^k = 0$ in $K[[x]]$. To prove this, fix any $n$ and then note
that $k > n$ implies $x^k(n) = 0 = 0(n)$.


Now that limits are available, we can define infinite sums (resp. products) as limits of 
partial finite sums (resp. products). 


7.30. Definition: Infinite Sums and Products of Formal Series. Suppose $(F_k : k \in
\mathbb{N})$ is a sequence of formal power series in $K[[x]]$. For each $N \ge 0$, let $G_N = F_0 + F_1 +
\cdots + F_N \in K[[x]]$ be the $N$th partial sum of this sequence. If $H = \lim_{N \to \infty} G_N$ exists in
$K[[x]]$, then we write $\sum_{k=0}^{\infty} F_k = H$. Similarly, let $P_N = F_0 F_1 \cdots F_N \in K[[x]]$ be the $N$th
partial product of the sequence of $F_k$'s. If $Q = \lim_{N \to \infty} P_N$ exists in $K[[x]]$, then we write
$\prod_{k=0}^{\infty} F_k = Q$. Analogous definitions are made for sums and products ranging over any
countably infinite index set (e.g., for $k$ ranging from 1 to $\infty$).


7.31. Example. Given $F = \sum_{n \ge 0} F_n x^n \in K[[x]]$, define a formal series $G_k = F_k X_k =
F_k x^k$ (see 7.9) for each $k \ge 0$. We have

$$\sum_{k=0}^{n} G_k = \sum_{k=0}^{n} F_k x^k = (F_0, F_1, \ldots, F_n, 0, 0, \ldots).$$

Given $m \in \mathbb{N}$, it follows that the coefficient of $x^m$ in any partial sum $\sum_{k=0}^{n} G_k$ with $n \ge m$
is $F_m = F(m)$. Thus, by definition, $\sum_{k=0}^{\infty} G_k$ has limit $F$. In other words,

$$\sum_{k=0}^{\infty} F_k X_k = F = \sum_{k=0}^{\infty} F_k x^k,$$

where the left side is an infinite sum of formal power series, and the right side is our notation
for the single formal power series $F$. This equality finally justifies the use of the “power series
notation” for elements of $K[[x]]$.


The previous example can be rephrased as follows. 


7.32. Theorem: Density of Polynomials in $K[[x]]$. For each $F \in K[[x]]$, there exists
a sequence of polynomials $f_n \in K[x]$ such that $\lim_{n \to \infty} f_n = F$. Specifically, we can take
$f_n = \sum_{k=0}^{n} F_k x^k$.


Testing the convergence of infinite sums and products of real-valued functions is a delicate
and often difficult problem. On the other hand, we can use the notion of order to give
simple and convenient criteria ensuring the existence of infinite sums and infinite products
of formal power series. Recall from calculus that a sequence of integers $(e_n : n \ge 0)$ tends to
infinity (in $\mathbb{R}$) iff for every integer $K > 0$, there exists $N$ such that $n \ge N$ implies $e_n > K$.


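7.33. Theorem: Existence Criteria for Limits of Formal Series. Suppose $(F_k : k \in
\mathbb{N})$ is a sequence of nonzero formal power series in $K[[x]]$.

(a) $\lim_{k \to \infty} F_k = 0$ in $K[[x]]$ iff $\lim_{k \to \infty} \mathrm{ord}(F_k) = \infty$ in $\mathbb{R}$.
(b) $\sum_{k=0}^{\infty} F_k$ exists in $K[[x]]$ iff $\lim_{k \to \infty} \mathrm{ord}(F_k) = \infty$ in $\mathbb{R}$.
(c) If $F_k(0) = 0$ for all $k$, then $\prod_{k=0}^{\infty} (1 + F_k)$ exists in $K[[x]]$ iff $\lim_{k \to \infty} \mathrm{ord}(F_k) = \infty$ in $\mathbb{R}$.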

Proof. (a) Assume $F_k \to 0$ in $K[[x]]$. Choose a fixed integer $M > 0$. For each $n$ between
0 and $M$, there exists an index $k_n$ such that $k \ge k_n$ implies $F_k(n) = 0$. Hence, whenever
$k \ge K = \max(k_0,k_1,\ldots,k_M)$, we have $F_k(n) = 0$ for all $n \le M$. It follows that $\mathrm{ord}(F_k) > M$
whenever $k \ge K$. This proves that the sequence of integers $(\mathrm{ord}(F_k) : k \in \mathbb{N})$ tends to infinity
as $k$ goes to infinity. Conversely, suppose $\mathrm{ord}(F_k) \to \infty$ as $k \to \infty$. Fix $n$, and choose $K$
so that $k \ge K$ implies $\mathrm{ord}(F_k) > n$. It follows immediately that $F_k(n) = 0 = 0(n)$ for all
$k \ge K$. Thus, $F_k \to 0$ in $K[[x]]$.

(b) Suppose $\sum_{k=0}^{\infty} F_k$ converges to $G$ in $K[[x]]$. Given an index $n$, we can therefore
choose $K$ so that $k \ge K$ implies

$$(F_0 + F_1 + \cdots + F_k)(n) = G(n).$$

Given $k > K$, note that

$$F_k(n) = (F_0 + \cdots + F_k)(n) - (F_0 + \cdots + F_{k-1})(n) = G(n) - G(n) = 0.$$

This proves that $F_k \to 0$ as $k \to \infty$, and hence $\mathrm{ord}(F_k) \to \infty$ by (a). Conversely, suppose
$\mathrm{ord}(F_k) \to \infty$, so that $F_k \to 0$ by (a). For each fixed $n$, the coefficient of $x^n$ in $F_k$ is
eventually zero, and hence the coefficient of $x^n$ in the partial sum $F_0 + F_1 + \cdots + F_k$
eventually stabilizes. Thus, these partial sums have a limit in $K[[x]]$.

(c) Suppose the indicated infinite product exists. We must show that for all $n$, $\mathrm{ord}(F_k)$
is eventually $\ge n$. We prove this by induction on $n$. The statement is true for $n = 1$
since $F_k(0) = 0$ for all $k$. Assume the statement holds for some $n \ge 1$. Choose $k_0$ so that
$\mathrm{ord}(F_k) \ge n$ for all $k \ge k_0$. Next, using the hypothesis that the infinite product exists,
choose $k_1$ so that $j,k \ge k_1$ implies

$$\Bigl[\prod_{i=0}^{j} (1 + F_i)\Bigr](n) = \Bigl[\prod_{i=0}^{k} (1 + F_i)\Bigr](n).$$

Note that

$$\prod_{i=0}^{k} (1 + F_i) - \prod_{i=0}^{k-1} (1 + F_i) = \Bigl[\prod_{i=0}^{k-1} (1 + F_i)\Bigr](1 + F_k - 1) = F_k \prod_{i=0}^{k-1} (1 + F_i). \qquad (7.1)$$

For $k > k_1$, the coefficient of $x^n$ on the left side is zero. On the other hand, for $k \ge k_0$, the
fact that $\mathrm{ord}(F_k) \ge n$ implies that

$$\Bigl[F_k \prod_{i=0}^{k-1} (1 + F_i)\Bigr](n) = F_k(n),$$

since the partial product has constant term 1. Combining these facts, we see that $F_k(n) = 0$
for $k > \max(k_0,k_1)$. This shows that $\mathrm{ord}(F_k)$ eventually exceeds $n$, completing the induction.

Conversely, suppose $\mathrm{ord}(F_k) \to \infty$ as $k \to \infty$. Fix $n$; we must show that the coefficient
of $x^n$ in the partial products $\prod_{i=0}^{k} (1 + F_i)$ eventually stabilizes. Choose $k_0$ so that $k \ge k_0$
implies $\mathrm{ord}(F_k) > n$. It suffices to show that for all $k \ge k_0$,

$$\Bigl[\prod_{i=0}^{k} (1 + F_i)\Bigr](n) = \Bigl[\prod_{i=0}^{k-1} (1 + F_i)\Bigr](n).$$

Subtracting and using (7.1), we see that the condition we want is equivalent to

$$\Bigl[F_k \prod_{i=0}^{k-1} (1 + F_i)\Bigr](n) = 0 \qquad (k \ge k_0).$$

This holds because the product appearing on the left side here has order greater than $n$. □

7.34. Example. The infinite product $\prod_{n=1}^{\infty} (1 + x^n)$ is a well-defined element of $K[[x]]$,
since $\mathrm{ord}(x^n) = n \to \infty$ as $n \to \infty$.
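
The stabilization of coefficients that makes this product well defined can be observed directly. The sketch below assumes series are truncated to their first `N` coefficients; `mul_trunc` and `partial_product` are illustrative names. Factors $1 + x^n$ with $n \ge N$ cannot change any coefficient below $x^N$, so the partial products agree on those coefficients from some point on.

```python
def mul_trunc(F, G, N):
    """Product of two coefficient lists, keeping only coefficients 0..N-1."""
    out = [0] * N
    for i, a in enumerate(F[:N]):
        if a == 0:
            continue
        for j, b in enumerate(G[:N - i]):
            out[i + j] += a * b
    return out


def partial_product(k, N):
    """Coefficients 0..N-1 of (1 + x)(1 + x^2)...(1 + x^k)."""
    P = [1] + [0] * (N - 1)
    for n in range(1, k + 1):
        factor = [0] * N
        factor[0] = 1
        if n < N:
            factor[n] = 1  # the term x^n, if it survives truncation
        P = mul_trunc(P, factor, N)
    return P


print(partial_product(3, 8))   # [1, 1, 1, 2, 1, 1, 1, 0]
print(partial_product(7, 8))   # [1, 1, 1, 2, 2, 3, 4, 5]
print(partial_product(20, 8))  # [1, 1, 1, 2, 2, 3, 4, 5]
```

The third partial product still differs from the limit in the coefficient of $x^4$, but from the seventh partial product onward the first eight coefficients no longer change.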


7.35. Theorem: Limit Rules for Sums and Products. Suppose $F_n, G_n, P, Q \in K[[x]]$
are formal series such that $F_n \to P$ and $G_n \to Q$. Then $F_n + G_n \to P + Q$ and $F_n G_n \to PQ$.


Proof. We prove the second statement, leaving the first as an exercise. Fix $m \in \mathbb{N}$. We must
find $N \in \mathbb{N}$ so that $n \ge N$ implies $(F_n G_n)(m) = (PQ)(m)$. For each $j \le m$, there is an
$N_j \in \mathbb{N}$ such that $n \ge N_j$ implies $F_n(j) = P(j)$. Similarly, for each $k \le m$, there is an $M_k \in
\mathbb{N}$ such that $n \ge M_k$ implies $G_n(k) = Q(k)$. Let $N = \max(N_0,\ldots,N_m,M_0,\ldots,M_m) \in \mathbb{N}$.
For any $n \ge N$,

$$(PQ)(m) = \sum_{j+k=m} P(j)Q(k) = \sum_{j+k=m} F_n(j)G_n(k) = (F_n G_n)(m). \qquad \square$$


7.6 Multiplicative Inverses in K[x] and K[[x]]


In any ring $S$, it is of interest to know which elements of $S$ have multiplicative inverses in $S$.


7.36. Definition: Units of a Ring. An element $x$ in a ring $S$ is called a unit of $S$ iff there
exists $y \in S$ with $xy = yx = 1_S$.


Suppose $y,z \in S$ satisfy $xy = yx = 1_S$ and $xz = zx = 1_S$. Then $y = y1 = y(xz) =
(yx)z = 1z = z$, so $y = z$. Thus, if $x$ has a multiplicative inverse in $S$, this inverse is unique.
We write $x^{-1}$ or $1/x$ to denote this inverse.


7.37. Example. If $|S| > 1$, then $1_S \ne 0_S$, and zero is not a unit of $S$. By definition, every
nonzero element of a field $F$ is a unit of $F$. In particular, the units of $\mathbb{Q}$ are the nonzero
rational numbers. On the other hand, the only units of the ring $\mathbb{Z}$ are 1 and $-1$. So 2 is a
unit of $\mathbb{Q}$ but not a unit of $\mathbb{Z}$.


Next we characterize the units of the polynomial ring $K[x]$ and the formal power series
ring $K[[x]]$. Our first result says that the only units in the polynomial ring $K[x]$ are the
nonzero scalars.


7.38. Theorem: Units of $K[x]$. A polynomial $f \in K[x]$ is a unit in $K[x]$ iff $\deg(f) = 0$.


Proof. The zero polynomial is not a unit of $K[x]$, so assume $f \ne 0$ henceforth. First, suppose
$f = a_0 x^0$ is a degree zero polynomial, so $a_0 \in K$ is nonzero. Since $K$ is a field, $a_0^{-1}$ exists
in $K$. In $K[x]$, we have $a_0 a_0^{-1} = 1_{K[x]} = a_0^{-1} a_0$, so $a_0^{-1}$ is also a multiplicative inverse for $f$
in the ring $K[x]$. Thus, $f$ is a unit of $K[x]$.

Conversely, suppose $\deg(f) > 0$. For any nonzero $g \in K[x]$, we know $\deg(fg) = \deg(f) +
\deg(g) > 0$ (this result uses the fact that $K$ is a field). Thus $fg \ne 1_{K[x]}$, since $\deg(1_{K[x]}) = 0$.
So $f$ does not have a multiplicative inverse $g$ in $K[x]$. □


Intuitively, a non-constant polynomial cannot be a unit since there is no way to get rid
of the positive powers of $x$. Perhaps surprisingly, when we pass to the larger ring of formal
power series, almost every element in the ring becomes a unit. More precisely, every series
with nonzero constant term has an inverse in $K[[x]]$. Before proving this, we consider an
example that occurs frequently.

7.39. Example: Formal Geometric Series. Consider the series $F = (1,-1,0,0,0,0,\ldots)$
and $G = (1,1,1,1,1,\ldots)$. Multiplying these series, we discover that

$$FG = GF = (1,0,0,0,\ldots) = 1_{K[[x]]}.$$

Thus the polynomial $1 - x$ is invertible in $K[[x]]$, and

$$(1-x)^{-1} = 1 + x + x^2 + \cdots + x^n + \cdots = \sum_{n \ge 0} x^n.$$

This is a formal version of the “geometric series formula” learned in calculus. In calculus,
one requires $|x| < 1$ to ensure convergence. There is no such restriction here, since the letter
$x$ in our formula does not denote a real number!


7.40. Theorem: Units in $K[[x]]$. A formal power series $F \in K[[x]]$ is a unit in $K[[x]]$ iff
$F(0) \ne 0$.


Proof. Assume $F(0) = 0$. For any $G \in K[[x]]$, $(FG)(0) = F(0)G(0) = 0 \ne 1 = 1_{K[[x]]}(0)$.
So $FG \ne 1$ in $K[[x]]$, and $F$ is not a unit of $K[[x]]$.

Conversely, assume $F(0) \ne 0$. Our goal is to find a series $G = \sum_{n \ge 0} G_n x^n$ such that
$FG = GF = 1_{K[[x]]}$. The desired equation $FG = 1$ holds in $K[[x]]$ iff the following infinite
system of equations holds in the field $K$:

$$\begin{aligned}
F_0 G_0 &= 1 \\
F_0 G_1 + F_1 G_0 &= 0 \\
F_0 G_2 + F_1 G_1 + F_2 G_0 &= 0 \\
&\;\;\vdots \\
\sum_{k=0}^{n} F_k G_{n-k} &= 0 \\
&\;\;\vdots
\end{aligned} \qquad (7.2)$$

We claim that there exist unique scalars $G_0, G_1, \ldots, G_n, \ldots$ solving this system. To prove
existence, we “solve the preceding system for the unknowns $G_n$.” More precisely, we
recursively define $G_0 = F_0^{-1} \in K$, $G_1 = F_0^{-1}(-F_1 G_0)$, $G_2 = F_0^{-1}(-F_1 G_1 - F_2 G_0)$, and in
general

$$G_n = -F_0^{-1} \sum_{k=1}^{n} F_k G_{n-k}. \qquad (7.3)$$

By construction, the scalars $G_n \in K$ defined in this way satisfy (7.2), and therefore $G =
\sum_{n \ge 0} G_n x^n$ satisfies $FG = GF = 1$. Since $G = F^{-1}$ in $K[[x]]$, $G$ (and hence the $G_n$) are
uniquely determined by $F$. □
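
The recursion (7.3) translates directly into a short program. The following sketch assumes $K = \mathbb{Q}$, modeled with Python's `fractions.Fraction`, and assumes that $F$ is given by a finite list of leading coefficients, with all later coefficients treated as zero; the function name `inverse_coeffs` is a hypothetical choice.

```python
from fractions import Fraction


def inverse_coeffs(F, N):
    """First N coefficients of 1/F, computed by the recursion (7.3).

    F is a list (F_0, F_1, ...) with F_0 != 0; coefficients beyond the
    end of the list are taken to be zero (an assumption of this sketch).
    """
    if F[0] == 0:
        raise ValueError("F is a unit only if its constant term is nonzero")
    inv0 = 1 / Fraction(F[0])
    G = [inv0]                      # G_0 = F_0^{-1}
    for n in range(1, N):
        # Sum of F_k G_{n-k} for k = 1..n, skipping k past the stored list.
        s = sum(F[k] * G[n - k] for k in range(1, min(n, len(F) - 1) + 1))
        G.append(-inv0 * s)
    return G


print(inverse_coeffs([1, -1], 6) == [1, 1, 1, 1, 1, 1])   # True: 1/(1 - x)
print(inverse_coeffs([1, -1, -1], 8) == [1, 1, 2, 3, 5, 8, 13, 21])  # True
print(inverse_coeffs([2, 1], 3))  # [Fraction(1, 2), Fraction(-1, 4), Fraction(1, 8)]
```

The first call recovers the geometric series of Example 7.39. Each new coefficient costs a sum over all earlier ones, so computing $N$ coefficients this way takes on the order of $N^2$ field operations.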


The preceding proof gives a recursive algorithm for calculating any given coefficient of
$1/F$ in terms of the coefficients of $F$ and lower coefficients of $1/F$. In some situations, the
following theorem (which generalizes 7.39) gives a more convenient way to calculate the
multiplicative inverse of a formal series.


7.41. Theorem: Formal Geometric Series. If $G \in K[[x]]$ satisfies $G(0) = 0$, then

$$(1 - G)^{-1} = \frac{1}{1 - G} = 1 + G + G^2 + G^3 + \cdots = \sum_{k=0}^{\infty} G^k \in K[[x]].$$


Proof. The theorem holds if $G = 0$, so assume $G$ is nonzero. Suppose $\mathrm{ord}(G) = d > 0$;
then $\mathrm{ord}(G^k) = kd$, which goes to infinity as $k \to \infty$. It follows that $H = \sum_{k=0}^{\infty} G^k$ exists.
Consider the coefficient of $x^n$ in $(1 - G)H$. By order considerations, this coefficient is the
same as the coefficient of $x^n$ in

$$(1 - G)(1 + G + G^2 + \cdots + G^n) = 1 - G^{n+1}.$$

Since $G^{n+1}(n) = 0$, we see that the coefficient in question is 1 if $n = 0$ and is 0 if $n > 0$.
Thus, $(1 - G)H = H(1 - G) = 1$, so $H = (1 - G)^{-1}$. □

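7.42. Example. Taking $G = x^i$ in the theorem (where $i \ge 1$ is a fixed integer), we find
that

$$\frac{1}{1 - x^i} = 1 + x^i + x^{2i} + x^{3i} + \cdots.$$

This series has the form $1 + F$ where $\mathrm{ord}(F) = i$. It follows that the infinite product
$\prod_{i=1}^{\infty} (1 - x^i)^{-1}$ exists. Similarly, $\prod_{i=1}^{\infty} (1 - x^i)$ exists because $\mathrm{ord}(-x^i) = i$, which goes to infinity
as $i$ goes to infinity. Next, we claim that

$$\prod_{i=1}^{\infty} (1 - x^i) \cdot \prod_{i=1}^{\infty} (1 - x^i)^{-1} = 1.$$

One might at first believe that this identity is automatically true (due to “cancellation”),
but care is required since we are dealing with formal infinite products. To justify this identity
carefully, let $P_n = \prod_{i=1}^{n} (1 - x^i)$ and $Q_n = \prod_{i=1}^{n} (1 - x^i)^{-1}$ for each $n \in \mathbb{N}$. Using 7.35, we
see that

$$\prod_{i=1}^{\infty} (1 - x^i) \cdot \prod_{i=1}^{\infty} (1 - x^i)^{-1} = \Bigl(\lim_{n \to \infty} P_n\Bigr)\Bigl(\lim_{n \to \infty} Q_n\Bigr) = \lim_{n \to \infty} (P_n Q_n) = \lim_{n \to \infty} 1 = 1.$$

Note that we can rearrange the factors in each finite product $P_n Q_n$ (since multiplication is
commutative) and cancel to obtain 1.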


7.43. Example. The geometric series formula can be used to invert any invertible formal
power series, not just series with constant term 1. For, suppose $F = \sum_{n \ge 0} F_n x^n$ where
$F_0 \ne 0$. We can write $F = F_0(1 - G)$ where $G = \sum_{n \ge 1} (-F_0^{-1} F_n) x^n$. Then

$$F^{-1} = F_0^{-1}(1 - G)^{-1} = F_0^{-1}[1 + G + G^2 + \cdots + G^k + \cdots].$$


7.7 Formal Laurent Series 


We saw in the last section that $G \in K[[x]]$ is a unit in $K[[x]]$ iff $G(0) \ne 0$. Sometimes we
want to divide by elements of $K[[x]]$ that are not units. To do this, we need to operate in
the field of fractions of the integral domain $K[[x]]$. We summarize this construction now,
omitting many routine details; more thorough treatments may be found in texts on abstract
algebra.


7.44. Construction: Field of Fractions of an Integral Domain. Let $D$ be an integral
domain, and let $D^*$ be the set of nonzero elements of $D$. Let $X = D \times D^*$ be the set of pairs
$(a,b)$ where $a,b \in D$ and $b$ is nonzero. Define a relation $\sim$ on $X$ by setting $(a,b) \sim (c,d)$
iff $ad = bc$. One may verify that $\sim$ is an equivalence relation (checking transitivity requires
the assumption that $D$ is an integral domain). Write $F = \mathrm{Frac}(D)$ to denote the set of
equivalence classes of this equivalence relation; also, write $a/b$ to denote the equivalence
class of $(a,b)$.

Given two elements $a/b$ and $c/d$ in $F$, define addition and multiplication as follows:

$$(a/b) + (c/d) = (ad + bc)/(bd); \qquad (a/b) \times (c/d) = (ac)/(bd).$$

It must be checked that these operations are independent of the representatives chosen
for the equivalence classes. For example, one must show that $a/b = a'/b'$ and $c/d = c'/d'$
imply $(ad + bc)/(bd) = (a'd' + b'c')/(b'd')$. Also define zero and one in $F$ by $0_F = 0_D/1_D$
and $1_F = 1_D/1_D$. One may now check that $(F,+,\times,0_F,1_F)$ is a commutative ring with
$1_F \ne 0_F$. The map $i : D \to F$ such that $i(a) = a/1_D$ for $a \in D$ is an injective ring
homomorphism that embeds $D$ as a subring of $F$ and allows us to regard $D$ as a subset of
$F$. Finally (and this is the point of the whole construction), every nonzero element $a/b \in F$
has a multiplicative inverse in $F$, namely $b/a$. This follows since $(a/b) \times (b/a) = (ab)/(ba)$,
and this equals $1_D/1_D = 1_F$ because $(ab)1 = (ba)1$ in $D$. Therefore, $F$ is a field.

The field $F$ has the following universal mapping property: for any injective ring homomorphism
$g : D \to L$ into a field $L$, there exists a unique ring homomorphism $g' : F \to L$ extending
$g$ (more precisely, such that $g = g' \circ i$). This homomorphism must be given by $g'(a/b) =
g(a)g(b)^{-1} \in L$ (proving uniqueness); for existence, one checks that the formula just written
does give a well-defined ring homomorphism extending $g$. (Injectivity of $g$ guarantees that $g(b) \ne 0$, so that $g(b)^{-1}$ exists.)


7.45. Example: Z and Q. Let Z be the ring of integers, which is an integral domain. The 
field Q of rational numbers is, by definition, the field of fractions of Z. 


7.46. Definition: Formal Rational Functions. The symbol $K(x)$ denotes the fraction
field of the integral domain $K[x]$. Elements of $K(x)$ are formal rational functions $p/q$, where
$p,q$ are polynomials with $q$ nonzero; we have $p/q = s/t$ in $K(x)$ iff $pt = qs$ in $K[x]$.


7.47. Definition: Formal Laurent Series. The symbol $K((x))$ denotes the fraction
field of the integral domain $K[[x]]$. Elements of $K((x))$ are formal Laurent series in one
indeterminate. A formal Laurent series is a quotient $G/H$, where $G,H$ are formal power
series with $H$ nonzero; we have $G/H = P/Q$ in $K((x))$ iff $GQ = HP$ in $K[[x]]$.


We can use our characterization of units in $K[[x]]$ to find a canonical description of the
elements of $K((x))$.


7.48. Theorem: Representation of Laurent Series. For every nonzero $S \in K((x))$,
there exists a unique integer $N$ and a unique series $F \in K[[x]]$ such that $S = x^N F$ and
$F(0) \ne 0$.


7.49. Remark. We call $N$ the order of $S$. When $N \ge 0$, so that $S \in K[[x]]$, this is
consistent with the previous definition of $\mathrm{ord}(S)$. When $N = -m$ is negative, $S = x^N F$ is
the fraction $F/x^m$. In this case, we often use the “Laurent series notation”

$$S = F_0 x^{-m} + F_1 x^{-m+1} + F_2 x^{-m+2} + \cdots + F_m x^0 + F_{m+1} x^1 + \cdots = \sum_{n=-m}^{\infty} F_{n+m} x^n.$$


Proof. Given a nonzero $S \in K((x))$, there exist nonzero series $G,H \in K[[x]]$ with $S = G/H$;
$G$ and $H$ are not unique. Write $\mathrm{ord}(G) = i$, $\mathrm{ord}(H) = j$, $G = \sum_{n \ge i} G_n x^n$, $H = \sum_{n \ge j} H_n x^n$,
where $G_i$ and $H_j$ are nonzero. Let $H^* = \sum_{n \ge 0} H_{n+j} x^n$, which is a unit in $K[[x]]$ since
$H^*(0) = H_j \ne 0$. Let $Q \in K[[x]]$ be the inverse of $H^*$, which satisfies $Q(0) \ne 0$. Similarly,
write $G^* = \sum_{n \ge 0} G_{n+i} x^n$ and note that $G^*(0) \ne 0$. Now,

$$S = \frac{G}{H} = \frac{x^i G^*}{x^j H^*} = \frac{x^i G^* Q}{x^j} = x^N F$$

if we set $N = i - j \in \mathbb{Z}$ and $F = G^* Q \in K[[x]]$. This proves existence of $N$ and $F$.

For uniqueness, assume $x^N F = x^M P$ for some $M \in \mathbb{Z}$ and some $P \in K[[x]]$ with
$P(0) \ne 0$. Choose $k \ge \max(|N|,|M|)$; then $x^{k+N} F = x^{k+M} P$. Both sides are nonzero series
in $K[[x]]$ and hence have an order. Since $F$ and $P$ have nonzero constant term, comparison
of orders gives $k + N = k + M$ and hence $N = M$. Dividing by $x^N$ (which is a unit in
$K((x))$!), we see that $F = P$. □
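
The existence half of this proof is constructive, and can be sketched in code under simplifying assumptions: $K = \mathbb{Q}$ (via `fractions.Fraction`), and $G$, $H$ given by finitely many leading coefficients, so that only finitely many coefficients of $F$ are determined. The helper names `order`, `inverse`, `mul_trunc`, and `laurent` are illustrative and not from the text.

```python
from fractions import Fraction


def order(F):
    """Index of the first nonzero stored coefficient."""
    for n, c in enumerate(F):
        if c != 0:
            return n
    raise ValueError("all stored coefficients are zero")


def inverse(F, N):
    """First N coefficients of 1/F by the recursion (7.3); needs F[0] != 0."""
    inv0 = 1 / Fraction(F[0])
    G = [inv0]
    for n in range(1, N):
        s = sum(F[k] * G[n - k] for k in range(1, min(n, len(F) - 1) + 1))
        G.append(-inv0 * s)
    return G


def mul_trunc(F, G, N):
    """Cauchy product, coefficients 0..N-1 (both inputs need length >= N)."""
    return [sum(F[i] * G[n - i] for i in range(n + 1)) for n in range(N)]


def laurent(G, H):
    """Return (N, F) with G/H = x^N F and F(0) != 0, up to truncation."""
    i, j = order(G), order(H)
    G_star, H_star = G[i:], H[j:]        # strip the factors x^i and x^j
    M = min(len(G_star), len(H_star))    # coefficients still determined
    Q = inverse(H_star, M)               # Q = (H*)^{-1}
    return i - j, mul_trunc(G_star, Q, M)


# S = 1 / (x^2 - x^3) = x^{-2} (1 + x + x^2 + ...)
N, F = laurent([1, 0, 0, 0, 0, 0], [0, 0, 1, -1, 0, 0])
print(N, F == [1, 1, 1, 1])  # -2 True
```

As Remark 7.50 records, the returned exponent is always `order(G) - order(H)`, whichever representation $G/H$ of $S$ is supplied.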


7.50. Remark. The proof shows that, for any representation of a nonzero $S \in K((x))$ as
a quotient $F/G$ with $F,G \in K[[x]]$, we have $\mathrm{ord}(S) = \mathrm{ord}(F) - \mathrm{ord}(G)$.


7.8 Formal Derivatives 


We now define a formal version of the derivative operation studied in calculus. 


7.51. Definition: Formal Derivatives. Given a formal series $F = \sum_{n \ge 0} F_n x^n \in K[[x]]$,
the formal derivative of $F$ is

$$F' = \frac{dF}{dx} = DF = \sum_{n \ge 0} (n+1)F_{n+1} x^n.$$

Higher-order formal derivatives are defined recursively by setting $F^{(k+1)} = (F^{(k)})'$ for $k \ge 1$.
It follows that

$$F^{(k)} = \sum_{n \ge 0} (n+1)(n+2)\cdots(n+k)F_{n+k} x^n.$$

The integer coefficients appearing in these formulas make sense, since we have assumed that
$K$ is a field containing $\mathbb{Q}$.


To give examples of formal differentiation, we introduce formal versions of some familiar 
functions from calculus. 


7.52. Definition: Formal Versions of Exponential, Logarithmic, and Trigonometric
Functions. Define the following elements in $K[[x]]$ (recall $K$ contains $\mathbb{Q}$):

$$\begin{aligned}
e^x &= (1,1,1/2,1/6,1/24,1/120,\ldots) = \sum_{n \ge 0} \frac{1}{n!} x^n; \\
\sin x &= (0,1,0,-1/6,0,1/120,0,\ldots) = \sum_{n \ge 0} \chi(n \text{ is odd}) \frac{(-1)^{(n-1)/2}}{n!} x^n; \\
\cos x &= (1,0,-1/2,0,1/24,0,-1/720,\ldots) = \sum_{n \ge 0} \chi(n \text{ is even}) \frac{(-1)^{n/2}}{n!} x^n; \\
\log(1+x) &= (0,1,-1/2,1/3,-1/4,1/5,\ldots) = \sum_{n \ge 1} \frac{(-1)^{n-1}}{n} x^n.
\end{aligned}$$

7.53. Example. Let $F = e^x$, $G = \sin x$, $H = \cos x$, and $P = \log(1+x)$ in $K[[x]]$. Using
the definition of formal derivatives, we find that

$$F' = (1 \cdot 1,\, 2 \cdot (1/2),\, 3 \cdot (1/3!),\, \ldots,\, (n+1) \cdot (1/(n+1)!),\, \ldots) = (1,1,1/2,1/6,\ldots,1/n!,\ldots) = F.$$

Thus the formal power series $e^x$ equals its own derivative, in accordance with the situation
in calculus. Iterating, we see that $F^{(k)} = F$ for all $k \ge 1$. Similar calculations show that
$G' = H$, $H' = -G$, and $P' \cdot (1,1,0,0,\ldots) = 1_{K[[x]]}$. We can express the last fact by writing

$$\frac{d}{dx} \log(1+x) = (1+x)^{-1}.$$


Formal derivatives obey many of the same differentiation rules that ordinary derivatives 
satisfy. However, each of these rules must be reproved in the formal setting. Some of these 
rules are stated in the next theorem. 


7.54. Theorem: Formal Differentiation Rules. Let $F,G,H_n \in K[[x]]$ and $c,c_n \in K$.
(a) $(F+G)' = F' + G'$ (sum rule).
(b) $(cF)' = c(F')$ (scalar rule).
(c) For $N \in \mathbb{N}$, $\bigl(\sum_{n=1}^{N} c_n H_n\bigr)' = \sum_{n=1}^{N} c_n H_n'$ (linear combination rule).
(d) $\frac{d}{dx}(x^k) = kx^{k-1}$ for all $k \ge 0$ (power rule).
(e) $(FG)' = F(G') + (F')G$ (product rule).
(f) If $H_n \to F$, then $H_n' \to F'$ (derivative of a limit).
(g) If $S = \sum_{n=1}^{\infty} H_n$ exists, then $S' = \sum_{n=1}^{\infty} H_n'$ (derivative of an infinite sum).


Proof. We prove (d), (e), and (f), leaving the others as exercises.
(d) Recall that $x^k = \sum_{n \ge 0} \chi(n = k) x^n$. The definition of formal derivative gives

$$\frac{d}{dx}(x^k) = \sum_{n \ge 0} (n+1)\chi(n+1 = k) x^n = \sum_{n \ge 0} k\chi(n = k-1) x^n = kx^{k-1}.$$

(e) Note on the one hand that

$$(FG)'_m = (m+1) \cdot (FG)_{m+1} = (m+1) \sum_{k=0}^{m+1} F_k G_{m+1-k}.$$

On the other hand,

$$\begin{aligned}
(FG' + F'G)_m &= (FG')_m + (F'G)_m \\
&= \sum_{k=0}^{m} F_k (G')_{m-k} + \sum_{j=0}^{m} (F')_j G_{m-j} \\
&= \sum_{k=0}^{m} F_k (m+1-k) G_{m+1-k} + \sum_{j=0}^{m} (j+1) F_{j+1} G_{m-j}.
\end{aligned}$$

In the first summation, we can let $k$ go from 0 to $m+1$ (which adds a zero term). In
the second summation, change the summation variable to $k = j+1$ and add a zero term
corresponding to $k = 0$. We get

$$\begin{aligned}
(FG' + F'G)_m &= \sum_{k=0}^{m+1} (m+1-k) F_k G_{m+1-k} + \sum_{k=0}^{m+1} k F_k G_{m+1-k} \\
&= \sum_{k=0}^{m+1} (m+1) F_k G_{m+1-k} = (FG)'_m.
\end{aligned}$$

This completes the proof of the product rule.
(f) Assume $\lim_{n \to \infty} H_n = F$. To prove $\lim_{n \to \infty} H_n' = F'$, fix $m \in \mathbb{N}$. Choose $N$ so that
$n \ge N$ implies $H_n(m+1) = F(m+1)$. By definition of formal derivatives, $n \ge N$ implies

$$H_n'(m) = (m+1)H_n(m+1) = (m+1)F(m+1) = F'(m). \qquad \square$$
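
The differentiation rules can be checked on truncated series. This is a sketch under stated assumptions: $K = \mathbb{Q}$ via `fractions.Fraction`, each series stored as its first `N` coefficients, and the helper names `derivative` and `mul_trunc` chosen for illustration. Differentiating a list of `N` coefficients determines only `N - 1` coefficients of the derivative.

```python
from fractions import Fraction
from math import factorial


def derivative(F):
    """Formal derivative: coefficient n of F' is (n + 1) F_{n+1}."""
    return [(n + 1) * F[n + 1] for n in range(len(F) - 1)]


def mul_trunc(F, G, N):
    """Product of two coefficient lists, keeping coefficients 0..N-1."""
    out = [0] * N
    for i, a in enumerate(F[:N]):
        for j, b in enumerate(G[:N - i]):
            out[i + j] += a * b
    return out


N = 8
exp_x = [Fraction(1, factorial(n)) for n in range(N)]
log1p = [Fraction(0)] + [Fraction((-1) ** (n - 1), n) for n in range(1, N)]

# (e^x)' = e^x, on the coefficients that survive truncation.
assert derivative(exp_x) == exp_x[:N - 1]

# (log(1 + x))' * (1 + x) = 1, as in Example 7.53.
assert mul_trunc(derivative(log1p), [1, 1], N - 1) == [1] + [0] * (N - 2)

# Product rule (FG)' = F G' + F' G with F = e^x and G = log(1 + x).
lhs = derivative(mul_trunc(exp_x, log1p, N))
rhs = [a + b for a, b in zip(mul_trunc(exp_x, derivative(log1p), N - 1),
                             mul_trunc(derivative(exp_x), log1p, N - 1))]
assert lhs == rhs
print(lhs[:4])
```

Exact rational arithmetic matters here: with floating-point coefficients the equalities would hold only approximately.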

A formal version of the chain rule will be given shortly, once we define formal composition
of two formal power series. We turn next to a formal analogue of the Maclaurin series of a 
real-valued function. 


7.55. Theorem: Formal Maclaurin Series. For all $F \in K[[x]]$, $F = \sum_{k \ge 0} \frac{F^{(k)}(0)}{k!} x^k$.


Proof. We have $F^{(k)} = \sum_{n \ge 0} (n+1)(n+2)\cdots(n+k)F_{n+k} x^n$, so $F^{(k)}(0) = k!\,F_k$. Since $K$
contains $\mathbb{Q}$, we can divide both sides by $k!$ in $K$. Thus, $F_k = F^{(k)}(0)/k!$ for all $k \in \mathbb{N}$. □


7.9 Composition of Polynomials 


Given functions $f,g : \mathbb{R} \to \mathbb{R}$, we can form the composite function $g \circ f : \mathbb{R} \to \mathbb{R}$ by setting
$(g \circ f)(z) = g(f(z))$ for each $z \in \mathbb{R}$. We would like to introduce a version of this composition
operation for formal power series. There is an immediate difficulty, since the formal series
$F = \sum_{n \ge 0} F_n x^n \in K[[x]]$ is not a function of $x$ and need not correspond to a convergent
power series in the variable $x$. On the other hand, we saw in 7.17 that a polynomial $f \in K[x]$
can be viewed as a function $P_f : R \to R$, where $R$ is any commutative ring containing $K$.
So we can define the formal composition of two polynomials as follows.


7.56. Definition: Composition of Polynomials. Given $f,g \in K[x]$, the formal
composition of $f$ and $g$ is $f \bullet g = P_f(g)$. More explicitly, if $f = \sum_{k=0}^{n} f_k x^k$ and $g = \sum_{j=0}^{m} g_j x^j$,
then

$$f \bullet g = \sum_{k=0}^{n} f_k \Bigl(\sum_{j=0}^{m} g_j x^j\Bigr)^k \in K[x].$$


Note that the filled circle $\bullet$ denotes formal composition, whereas an open circle $\circ$ denotes
ordinary composition of functions. We can also write $f \bullet g = \mathrm{ev}_g(f)$, where $\mathrm{ev}_g : K[x] \to K[x]$
is the evaluation homomorphism that sets $x$ equal to $g$ (see 7.22).

The following theorem shows that formal composition of polynomials satisfies properties 
analogous to those satisfied by ordinary composition of functions. The proof makes heavy 
use of the properties of evaluation homomorphisms. 


7.57. Theorem: Properties of Polynomial Composition. Suppose $f,g,h \in K[x]$ are
polynomials, $c \in K$, and $R$ is a commutative ring containing $K$.

(a) $P_{f \bullet g} = P_f \circ P_g : R \to R$ (comparison of formal and ordinary composition).

(b) $(f \bullet g) \bullet h = f \bullet (g \bullet h)$ (associativity).

(c) $(f+g) \bullet h = (f \bullet h) + (g \bullet h)$, $(f \cdot g) \bullet h = (f \bullet h) \cdot (g \bullet h)$, $c \bullet h = c$, and $(cf) \bullet h = c(f \bullet h)$
(homomorphism properties).

(d) $(f \bullet g)' = (f' \bullet g) \cdot g'$ (formal chain rule).


Proof. (a) Given $z \in R$, we must show that $P_{f \bullet g}(z) = P_f(P_g(z))$. Let us rewrite each side
in terms of evaluation homomorphisms. The left side is

$$P_{f \bullet g}(z) = \mathrm{ev}_z(f \bullet g) = \mathrm{ev}_z(\mathrm{ev}_g(f)) = (\mathrm{ev}_z \circ \mathrm{ev}_g)(f).$$

The right side is

$$P_f(P_g(z)) = P_f(\mathrm{ev}_z(g)) = \mathrm{ev}_{\mathrm{ev}_z(g)}(f).$$

The equality in (a) will follow if we can show that the two ring homomorphisms $\mathrm{ev}_z \circ \mathrm{ev}_g$
and $\mathrm{ev}_{\mathrm{ev}_z(g)}$ from $K[x]$ into $R$ are equal. Both maps fix every element of $K$, so by the uniqueness part of 7.22, we need only verify
that the two homomorphisms have the same effect on the polynomial $x$. This holds because

$$(\mathrm{ev}_z \circ \mathrm{ev}_g)(x) = \mathrm{ev}_z(\mathrm{ev}_g(x)) = \mathrm{ev}_z(g) = \mathrm{ev}_{\mathrm{ev}_z(g)}(x).$$

(b) In part (a), let $R$ be the ring $K[x]$ and apply both sides to $z = h$. We obtain

$$(f \bullet g) \bullet h = P_{f \bullet g}(h) = P_f(P_g(h)) = P_f(g \bullet h) = f \bullet (g \bullet h).$$

(c) This is a transcription of the statement that $\mathrm{ev}_h$ is a ring homomorphism fixing all
elements of $K$; for example,

$$(fg) \bullet h = \mathrm{ev}_h(fg) = \mathrm{ev}_h(f)\,\mathrm{ev}_h(g) = (f \bullet h)(g \bullet h).$$

(d) Let us first prove the special case where $f = x^n$ for some $n \ge 0$. Since $f' = nx^{n-1}$,
we must show that

$$(g^n)' = ng^{n-1}g' \qquad (g \in K[x]).$$

We proceed by induction on $n \ge 0$. The base case $n = 0$ holds because both sides are zero.
Assuming the formula holds for some $n \ge 0$, we use the formal product rule to calculate

$$(g^{n+1})' = (g^n g)' = (g^n)'g + g^n g' = ng^{n-1}g'g + g^n g' = (n+1)g^n g'.$$

The general case of (d) now follows because the formula is “$K$-linear in $f$.” More precisely,
if (d) holds for polynomials $f_1$ and $f_2$, it also holds for $f_1 + f_2$ because (by (c))

$$\begin{aligned}
((f_1 + f_2) \bullet g)' &= ((f_1 \bullet g) + (f_2 \bullet g))' = (f_1 \bullet g)' + (f_2 \bullet g)' \\
&= (f_1' \bullet g)g' + (f_2' \bullet g)g' = (f_1' \bullet g + f_2' \bullet g)g' \\
&= ((f_1' + f_2') \bullet g)g' = ((f_1 + f_2)' \bullet g)g'.
\end{aligned}$$

Similarly, if (d) holds for some $f \in K[x]$, then (d) holds for $cf$ whenever $c \in K$. Since every
polynomial is a finite $K$-linear combination of powers of $x$, it follows that (d) is true for all
polynomials $f$, as desired. □


7.10 Composition of Formal Power Series 


We can extend the definition of the formal composition $f \bullet g$ to the case where $f \in K[x]$ is a
polynomial and $G \in K[[x]]$ is a formal series by setting $f \bullet G = P_f(G) = \mathrm{ev}_G(f)$, as in 7.56.
A more challenging problem is to define the composition $F \bullet G$ when $F$ and $G$ are both
formal series. To see what can go wrong, suppose $F = \sum_{n \ge 0} F_n x^n$ and $G = \sum_{m \ge 0} G_m x^m$
are formal series. By analogy with the preceding definitions, we would like to define

$$F \bullet G = \sum_{n \ge 0} F_n G^n = \sum_{n \ge 0} F_n \Big( \sum_{k \ge 0} G_k x^k \Big)^{n} \in K[[x]].$$

The trouble is that the infinite sum of formal series $\sum_{n \ge 0} F_n G^n$ may not be well-defined.
Indeed, if $F_n \ne 0$ for infinitely many values of $n$ and $G_0 = 1$, consideration of the constant
term shows that $\sum_{n \ge 0} F_n G^n$ does not exist. However, we can escape this difficulty by
requiring that the right-hand factor in a formal composition $F \bullet G$ have zero constant term.
This leads to the following definition.


7.58. Definition: Composition of Formal Power Series. Given $F, G \in K[[x]]$ with
$G(0) = 0$, the formal composition of $F$ and $G$ is

$$F \bullet G = \sum_{n \ge 0} F_n G^n \in K[[x]].$$

This infinite sum converges by 7.33, because $\mathrm{ord}(F_n G^n) \ge n$ whenever $F_n G^n \ne 0$ and so
the orders of the nonzero summands go to $\infty$ as $n \to \infty$.
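
The convergence argument in 7.58 also says how to compute a composition in practice: the coefficient of $x^m$ only involves the summands with $n \le m$. The following Python sketch illustrates this for series truncated to their first N coefficients. The function names `mul` and `compose` and the list-of-coefficients representation are assumptions made for this illustration, not notation from the text.

```python
from fractions import Fraction

def mul(a, b, N):
    """Product of two series, truncated to coefficients 0..N-1."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a[:N]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:N - i]):
            c[i + j] += ai * bj
    return c

def compose(F, G, N):
    """Coefficients 0..N-1 of F . G = sum_n F_n G^n, assuming G(0) = 0."""
    if G and G[0] != 0:
        raise ValueError("formal composition requires G(0) = 0")
    result = [Fraction(0)] * N
    power = [Fraction(1)] + [Fraction(0)] * (N - 1)  # G^0
    for n in range(N):  # summands with n >= N have order >= N and are ignored
        if n < len(F):
            for m in range(N):
                result[m] += F[n] * power[m]
        power = mul(power, G, N)  # advance to G^(n+1)
    return result

# Composing 1/(1-x) with x + x^2 gives 1/(1 - x - x^2).
print(compose([1, 1, 1, 1, 1], [0, 1, 1], 5))
```

Here any coefficients of F beyond the supplied list are treated as zero, so the sketch composes a polynomial truncation of F with G.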


7.59. Theorem: Identity for Formal Composition. For all $F \in K[[x]]$, $F \bullet x = F$. For
all $G \in K[[x]]$ with $G(0) = 0$, $x \bullet G = G$.


Proof. For any $F \in K[[x]]$, $F \bullet x = \sum_{n \ge 0} F_n x^n = F$. Next, recall that
$x = X_1 = \sum_{n \ge 0} \chi(n = 1) x^n$. So, for $G \in K[[x]]$ with $G(0) = 0$,

$$x \bullet G = \sum_{n \ge 0} \chi(n = 1) G^n.$$

One may check that this infinite sum of formal series converges to $G^1 = G$. □


The next technical result will aid us in proving further facts about formal composition. 


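7.60. Theorem: Coefficient in a Composition. For all $F, G \in K[[x]]$ with $G(0) = 0$
and all $m \in \mathbb{N}$,

$$(F \bullet G)_m = \Big( \sum_{n=0}^{m} F_n G^n \Big)_m.$$


Proof. Since $(F \bullet G)_m = \big( \sum_{n=0}^{\infty} F_n G^n \big)_m$, it suffices to show that

$$\Big( \sum_{n=0}^{p} F_n G^n \Big)_m = \Big( \sum_{n=0}^{p+1} F_n G^n \Big)_m$$

for all $p \ge m$. This holds since

$$\Big( \sum_{n=0}^{p+1} F_n G^n \Big)_m = \Big( \sum_{n=0}^{p} F_n G^n \Big)_m + (F_{p+1} G^{p+1})_m,$$

and $F_{p+1} G^{p+1}$ is either zero or has order at least $p + 1 > m$ (since $\mathrm{ord}(G) \ge 1$). □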


7.61. Theorem: Joint Continuity of Formal Composition. Suppose $F_n, G_n, P, Q \in K[[x]]$
are formal series such that $G_n(0) = 0$ for all $n \in \mathbb{N}$, $F_n \to P$, and $G_n \to Q$ (forcing
$Q(0) = 0$). Then $F_n \bullet G_n \to P \bullet Q$.


Proof. Fix $m \in \mathbb{N}$; we must show that $(P \bullet Q)(m) = (F_n \bullet G_n)(m)$ for all sufficiently large
$n$. By 7.60,

$$(P \bullet Q)(m) = \Big( \sum_{i=0}^{m} P(i) Q^i \Big)(m); \tag{7.4}$$

$$(F_n \bullet G_n)(m) = \Big( \sum_{i=0}^{m} F_n(i) G_n^i \Big)(m). \tag{7.5}$$

Now, for each fixed $i \le m$, iteration of 7.35 shows that $G_n^i \to Q^i$ as $n \to \infty$. For each
fixed $i$, the sequence of (order zero) power series $F_n(i)$ converges to $P(i)$ as $n \to \infty$, since
$F_n \to P$. Using 7.35 again, we see that $F_n(i) G_n^i \to P(i) Q^i$ for each fixed $i \le m$, and hence
(by 7.35)

$$\lim_{n \to \infty} \Big( \sum_{i=0}^{m} F_n(i) G_n^i \Big) = \sum_{i=0}^{m} P(i) Q^i.$$

It follows that the right sides of (7.4) and (7.5) do agree for large enough $n$, which is what
we needed to show. □


The previous result combined with 7.32 allows us to use “continuity arguments” to 
deduce properties of composition of formal power series from corresponding properties of 
composition of polynomials. The next few theorems illustrate this technique. 


7.62. Theorem: Homomorphism Properties of Formal Composition.

Let $G \in K[[x]]$ satisfy $G(0) = 0$.

(a) For all $F, H \in K[[x]]$, $(F + H) \bullet G = (F \bullet G) + (H \bullet G)$.

(b) For all $F, H \in K[[x]]$, $(FH) \bullet G = (F \bullet G)(H \bullet G)$.

(c) For all $c \in K$, $c \bullet G = c$.

So, the evaluation map $\mathrm{ev}_G : K[[x]] \to K[[x]]$ given by $\mathrm{ev}_G(F) = F \bullet G$ is a ring
homomorphism fixing $K$ and sending $x$ to $G$.


Proof. We prove (a), leaving (b) and (c) as exercises. Fix $F, G, H \in K[[x]]$ with $G(0) = 0$.
Use 7.32 to choose polynomials $f_n, g_n, h_n \in K[x]$ with $g_n(0) = 0$ for all $n$, $f_n \to F$, $g_n \to G$,
and $h_n \to H$. For each $n \in \mathbb{N}$, we know from 7.57(c) that

$$(f_n + h_n) \bullet g_n = (f_n \bullet g_n) + (h_n \bullet g_n).$$

Take the limit of both sides as $n \to \infty$. Using 7.35 and 7.61, we get
$(F + H) \bullet G = (F \bullet G) + (H \bullet G)$, as desired. □


7.63. Theorem: Associativity of Formal Composition. Suppose $F, G, H \in K[[x]]$
satisfy $G(0) = 0 = H(0)$. Then

$$(F \bullet G) \bullet H = F \bullet (G \bullet H).$$


Proof. First note that all compositions in the theorem statement are defined; in particular,
$F \bullet (G \bullet H)$ is defined because $G \bullet H$ has zero constant term. Use 7.32 to choose polynomials
$f_n, g_n, h_n \in K[x]$ with $g_n(0) = 0 = h_n(0)$ for all $n$, $f_n \to F$, $g_n \to G$, and $h_n \to H$. For
each $n \in \mathbb{N}$, we know from 7.57(b) that $(f_n \bullet g_n) \bullet h_n = f_n \bullet (g_n \bullet h_n)$. Taking limits and
using 7.61 repeatedly gives the desired result. □


A similar continuity argument (left as an exercise) establishes the following differentia- 
tion rule. 


7.64. Theorem: Formal Chain Rule. For all $F, G \in K[[x]]$ with $G(0) = 0$,
$(F \bullet G)' = (F' \bullet G)\, G'$.


7.65. Theorem: Inverses for Formal Composition. Let $S = \{F \in K[[x]] : F(0) = 0 \text{ and } F(1) \ne 0\}$.
(a) If $F, G \in S$, then $F \bullet G \in S$ (closure of $S$).
(b) If $F \in S$, there exists a unique $G \in S$ with $F \bullet G = x = G \bullet F$ (inverses).

(Together with 7.59 and 7.63, this proves that $(S, \bullet)$ is a group as defined in 9.1. The
proof will show that if $F, G \in S$ and $G \bullet F = x$, then $F \bullet G = x$ automatically follows.)

Proof. (a) Suppose $F$ and $G$ belong to $S$. On one hand, since $F(0) = 0 = G(0)$, $F \bullet G$ is
defined and also has zero constant term. On the other hand, 7.60 gives $(F \bullet G)_1 = F_1 G_1 \ne 0$.
So $F \bullet G \in S$.

(b) First we prove that for each $F \in S$, there exists a unique $G \in S$ with $G \bullet F = x$ (we
call $G$ a "left inverse" of $F$). By 7.60,

$$(G \bullet F)_n = \Big( \sum_{m=0}^{n} G_m F^m \Big)_n \qquad (n \in \mathbb{N}).$$

We can use this equation to give a recursive prescription for the coefficients $G_n$. At the
start, we must set $G_0 = 0$ and $G_1 = 1/F_1 \ne 0$. (Note $1/F_1$ exists because $F_1$ is a nonzero
element of the field $K$.) Assume $n > 1$ and $G_0, G_1, \ldots, G_{n-1}$ have already been determined.
Since $(G_n F^n)_n = G_n (F_1)^n$, we need to choose $G_n$ so that

$$0 = \Big( \sum_{m=0}^{n} G_m F^m \Big)_n = G_n F_1^n + \Big( \sum_{m=0}^{n-1} G_m F^m \Big)_n.$$

Evidently there is a unique $G_n \in K$ that will work, namely

$$G_n = -\frac{1}{F_1^n} \Big( \sum_{m=0}^{n-1} G_m F^m \Big)_n. \tag{7.6}$$

Since $G \in S$, we have shown that $F$ has a unique left inverse in $S$.
To finish the proof, fix $F \in S$. Let $G$ be the left inverse of $F$, and let $H$ be the left
inverse of $G$. Then

$$H = H \bullet x = H \bullet (G \bullet F) = (H \bullet G) \bullet F = x \bullet F = F.$$

Since $H = F$, we see that both $G \bullet F$ and $F \bullet G = H \bullet G$ equal the identity element $x$.
Thus, $G$ is the two-sided inverse of $F$. □
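
The recursion (7.6) translates directly into a procedure for computing compositional inverses coefficient by coefficient. The Python sketch below is one possible transcription using exact rational arithmetic; the names `mul` and `left_inverse` are hypothetical helpers introduced here, and series are assumed to be given as lists of their first N coefficients with N at least 2.

```python
from fractions import Fraction
from math import factorial

def mul(a, b, N):
    """Product of two series, truncated to coefficients 0..N-1."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a[:N]):
        for j, bj in enumerate(b[:N - i]):
            c[i + j] += ai * bj
    return c

def left_inverse(F, N):
    """Coefficients G_0..G_{N-1} of the series G with G . F = x, using (7.6)."""
    F = [Fraction(c) for c in F[:N]] + [Fraction(0)] * max(0, N - len(F))
    if F[0] != 0 or F[1] == 0:
        raise ValueError("need F(0) = 0 and F(1) != 0")
    # powers[m] holds F^m truncated to N coefficients
    powers = [[Fraction(1)] + [Fraction(0)] * (N - 1)]
    for _ in range(1, N):
        powers.append(mul(powers[-1], F, N))
    G = [Fraction(0)] * N
    G[1] = 1 / F[1]
    for n in range(2, N):
        s = sum(G[m] * powers[m][n] for m in range(n))
        G[n] = -s / F[1] ** n
    return G

# E = e^x - 1; its inverse should agree with log(1+x) = x - x^2/2 + x^3/3 - ...
E = [Fraction(0)] + [Fraction(1, factorial(n)) for n in range(1, 6)]
print(left_inverse(E, 6))
```

Storing all powers of F keeps the sketch close to the formula; a more economical version would reuse partial sums.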


7.66. Remark. Lagrange's inversion formula (8.15) provides an alternate way to determine
the coefficients of the compositional inverse of a formal series $F$, which is sometimes easier
to use than the recursive formula for $G_n$ in the preceding proof.


7.67. Example. Consider the series $E = e^x - 1 = \sum_{n \ge 1} x^n / n!$ and
$L = \log(1 + x) = \sum_{n \ge 1} (-1)^{n-1} x^n / n$. Let us show that $L$ is the two-sided inverse of $E$ relative to formal
composition. Set $H = L \bullet E$; since $H(0) = 0$, it will suffice to prove that $H' = 1$ (cf. 7.132).
First, a routine formal differentiation shows that $E' = e^x = 1 + E$ and
$L' = 1 - x + x^2 - x^3 + \cdots = (1 + x)^{-1}$. We also have $L' \bullet E = 1 - E + E^2 - E^3 + \cdots = (1 + E)^{-1}$ by 7.41.
The formal chain rule now gives

$$H' = (L \bullet E)' = (L' \bullet E) E' = (1 + E)^{-1} (1 + E) = 1.$$

We conclude that $L \bullet E = x$, hence also $E \bullet L = x$.


7.11 Generalized Binomial Expansion 


Given a formal power series $F \in K[[x]]$, we would like to give meaning to expressions
like $F^{1/2} = \sqrt{F}$. To prepare for this, we will first define, for each $r \in K$, a power series
$\mathrm{Pow}_r \in K[[x]]$ that is a formal analogue of the function $x \mapsto (1 + x)^r$. The following example
from calculus motivates the definition of $\mathrm{Pow}_r$.

7.68. Example. Consider the real-valued function $f(x) = (1 + x)^r$, where $r$ is a fixed real
constant, and $x$ ranges over real numbers $> -1$. If $f$ has a Taylor series expansion about
$x = 0$, then the coefficients of the power series must be given by Taylor's formula

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n$$

(cf. 7.55). Computing the successive derivatives of $f$, we find $f'(x) = r(1 + x)^{r-1}$,
$f''(x) = r(r - 1)(1 + x)^{r-2}$, and in general

$$f^{(n)}(x) = r(r - 1)(r - 2) \cdots (r - n + 1)(1 + x)^{r-n} = (r){\downarrow}_n (1 + x)^{r-n}$$

(here we use the falling factorial notation from 2.76). Evaluating the derivatives at $x = 0$,
we conclude that

$$(1 + x)^r = \sum_{n=0}^{\infty} \frac{(r){\downarrow}_n}{n!} x^n$$

for $x$ close enough to zero, provided $f(x)$ converges to its Taylor series expansion. One can
prove convergence by bounding the remainder term in Taylor's theorem. We will omit the
details since they are not needed in the formal setting considered here.


Motivated by the formula in the previous example, we define the following series to
model the function $(1 + x)^r$.


7.69. Definition: Falling Factorials and $\mathrm{Pow}_r$. For every $r \in K$ and every integer
$n \ge 1$, define the falling factorial

$$(r){\downarrow}_n = r(r - 1)(r - 2) \cdots (r - n + 1) \in K.$$

Let $(r){\downarrow}_0 = 1$. Define the formal power series

$$\mathrm{Pow}_r = \sum_{n=0}^{\infty} \frac{(r){\downarrow}_n}{n!} x^n \in K[[x]].$$

(This definition uses the assumption that $K$ is a field containing $\mathbb{Q}$.)


7.70. Example: $\mathrm{Pow}_r$ for Integral $r$. Suppose $r \in \mathbb{N} \subseteq K$. Let us show that
$\mathrm{Pow}_r = (1 + x)^r$ (the product of $r$ copies of the series $1 + x$) in this case. First note that $(r){\downarrow}_n / n!$
reduces to the binomial coefficient $\binom{r}{n} = \frac{r!}{n!(r-n)!}$ when $r$ is a nonnegative integer and $n \le r$ (and this
coefficient is zero for $n > r$). Next, invoking the binomial theorem 2.14 in the commutative
ring $K[[x]]$, we see that

$$(1 + x)^r = \sum_{n=0}^{r} \binom{r}{n} x^n = \sum_{n=0}^{\infty} \frac{(r){\downarrow}_n}{n!} x^n = \mathrm{Pow}_r.$$

Similarly, if $r = -1$, the definition of falling factorials shows that $(r){\downarrow}_n / n! = (-1)^n$ for all
$n \ge 0$. On the other hand, we have seen that the multiplicative inverse of $1 + x$ in $K[[x]]$ is

$$(1 + x)^{-1} = 1 - x + x^2 - x^3 + \cdots = \sum_{n=0}^{\infty} (-1)^n x^n = \sum_{n=0}^{\infty} \frac{(-1){\downarrow}_n}{n!} x^n = \mathrm{Pow}_{-1}.$$

So $\mathrm{Pow}_r = (1 + x)^r$ is also true when $r = -1$. We will see in a moment that the same result
holds for all negative integers $r$.

If $x, r, s$ are real numbers, we have the familiar law of exponents: $(1 + x)^{r+s} = (1 + x)^r \cdot (1 + x)^s$.
We now prove the formal analogue of this result.


7.71. Theorem: Formal Exponent Law for $\mathrm{Pow}_r$. For all $r, s \in K$,
$\mathrm{Pow}_{r+s} = \mathrm{Pow}_r \mathrm{Pow}_s$.


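Proof. We show $\mathrm{Pow}_{r+s}(n) = (\mathrm{Pow}_r \mathrm{Pow}_s)(n)$ for each $n \ge 0$. On one hand,
$\mathrm{Pow}_{r+s}(n) = (r + s){\downarrow}_n / n!$. On the other hand,

$$(\mathrm{Pow}_r \mathrm{Pow}_s)(n) = \sum_{k=0}^{n} \mathrm{Pow}_r(k)\, \mathrm{Pow}_s(n - k) = \sum_{k=0}^{n} \frac{(r){\downarrow}_k}{k!} \cdot \frac{(s){\downarrow}_{n-k}}{(n-k)!}.$$

Comparing these expressions, we see that we must prove the identities

$$(r + s){\downarrow}_n = \sum_{k=0}^{n} \binom{n}{k} (r){\downarrow}_k\, (s){\downarrow}_{n-k}$$

for all $n \ge 0$. We use induction on $n$. When $n = 0$, the identity reads $1 = \binom{0}{0} \cdot 1 \cdot 1$, which
is true. Assume the identity holds for some $n \ge 0$. Using the recursion
$\binom{n+1}{j} = \binom{n}{j-1} + \binom{n}{j}$, we compute

$$(r + s){\downarrow}_{n+1} = (r + s){\downarrow}_n\, (r + s - n) = \sum_{k=0}^{n} \binom{n}{k} (r){\downarrow}_k\, (s){\downarrow}_{n-k}\, \big( (r - k) + (s - (n - k)) \big)$$
$$= \sum_{k=0}^{n} \binom{n}{k} (r){\downarrow}_{k+1}\, (s){\downarrow}_{n-k} + \sum_{k=0}^{n} \binom{n}{k} (r){\downarrow}_k\, (s){\downarrow}_{n-k+1}$$
$$= \sum_{j=1}^{n+1} \binom{n}{j-1} (r){\downarrow}_j\, (s){\downarrow}_{n+1-j} + \sum_{j=0}^{n} \binom{n}{j} (r){\downarrow}_j\, (s){\downarrow}_{n+1-j}$$
$$= \sum_{j=0}^{n+1} \binom{n+1}{j} (r){\downarrow}_j\, (s){\downarrow}_{n+1-j}.$$

In the next-to-last step, we changed summation variables to $j = k + 1$ in the first sum, and
$j = k$ in the second sum. The reader should check that the last equality is valid even for
the extreme terms $j = 0$ and $j = n + 1$. This completes the induction argument. □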
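
The exponent law is easy to spot-check numerically. The following Python sketch computes the first few coefficients $(r){\downarrow}_n / n!$ of $\mathrm{Pow}_r$ for rational $r$ and multiplies truncated series; the helper names `falling`, `pow_coeffs`, and `mul` are assumptions of this illustration, and agreement on finitely many coefficients is a sanity check rather than a proof.

```python
from fractions import Fraction

def falling(r, n):
    """Falling factorial r (r-1) ... (r-n+1), with the empty product equal to 1."""
    out = Fraction(1)
    for i in range(n):
        out *= (r - i)
    return out

def pow_coeffs(r, N):
    """Coefficients 0..N-1 of Pow_r, namely falling(r, n) / n!."""
    r = Fraction(r)
    coeffs, fact = [], 1
    for n in range(N):
        if n > 0:
            fact *= n
        coeffs.append(falling(r, n) / fact)
    return coeffs

def mul(a, b, N):
    """Product of two series, truncated to coefficients 0..N-1."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a[:N]):
        for j, bj in enumerate(b[:N - i]):
            c[i + j] += ai * bj
    return c

half = pow_coeffs(Fraction(1, 2), 6)
print(half[:4])            # first coefficients of Pow_{1/2}
print(mul(half, half, 6))  # should match Pow_1 = 1 + x
```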


7.72. Theorem: Negative Binomial Formula. For every integer $r > 0$,

$$(1 + x)^{-r} = \mathrm{Pow}_{-r} = \sum_{n=0}^{\infty} \binom{n + r - 1}{n} (-1)^n x^n \in K[[x]].$$


Proof. The first equality follows by iterating 7.71 $r$ times, recalling that $(1 + x)^{-1} = \mathrm{Pow}_{-1}$.
The second equality follows from 7.69 and the identity

$$\frac{(-r){\downarrow}_n}{n!} = \frac{(-r)(-r - 1)(-r - 2) \cdots (-r - (n - 1))}{n!} = (-1)^n \frac{(r + n - 1){\downarrow}_n}{n!} = (-1)^n \binom{n + r - 1}{n}.$$
□


We have now shown that $\mathrm{Pow}_r = (1 + x)^r$ holds for all integers $r$. So we can introduce
the following notation for $\mathrm{Pow}_r$ without risk of ambiguity.


7.73. Definition: $(1 + x)^r$. For any $r \in K$, let $(1 + x)^r$ denote the series $\mathrm{Pow}_r \in K[[x]]$.


7.12 Generalized Powers of Formal Series


We now have the necessary tools to define operations such as $\sqrt{F}$, provided the formal
power series $F$ satisfies suitable hypotheses.


7.74. Definition: Generalized Powers. Suppose $F \in K[[x]]$ has $F(0) = 1$, and $r \in K$.
Let $F^r$ be the composition $\mathrm{Pow}_r \bullet (F - 1)$, which is defined since $F - 1$ has zero constant
term.


Informally, $\mathrm{Pow}_r \bullet (F - 1) = (1 + (F - 1))^r$, so this definition is reasonable. Observe that
$F^r$ always has constant term 1. When $r = 1/n$ for $n$ a positive integer, we also write $\sqrt[n]{F}$
to denote $F^{1/n}$.

Many familiar rules for manipulating powers remain true in the formal setting, but they
must be reproved formally before they can be used.


7.75. Theorem: Properties of Formal Powers. Suppose $F \in K[[x]]$ has $F(0) = 1$. For
any $r, s \in K$, $F^{r+s} = F^r \cdot F^s$. Furthermore, when $r$ is an integer, $F^r$ (as defined in 7.74)
coincides with the customary algebraic definition of $F^r$ (namely the product of $r$ copies of
$F$ for $r \ge 0$, or $|r|$ copies of $1/F$ for $r < 0$).


Proof. Recall from 7.71 that $\mathrm{Pow}_{r+s} = \mathrm{Pow}_r \mathrm{Pow}_s$. Using the fact that composing on the
right by $F - 1$ is a ring homomorphism (see 7.62(b)), we obtain

$$F^{r+s} = \mathrm{Pow}_{r+s} \bullet (F - 1) = (\mathrm{Pow}_r \mathrm{Pow}_s) \bullet (F - 1) = (\mathrm{Pow}_r \bullet (F - 1))(\mathrm{Pow}_s \bullet (F - 1)) = F^r F^s.$$

For the rest of this proof, $n$ will denote an integer and $F^n$ will have the usual algebraic
meaning (repeated multiplication). We must prove $\mathrm{Pow}_n \bullet (F - 1) = F^n$ for all $n \in \mathbb{Z}$. When
$n = 0$, $\mathrm{Pow}_0 \bullet (F - 1) = 1 \bullet (F - 1) = 1 = F^0$. When $n = 1$,
$\mathrm{Pow}_1 \bullet (F - 1) = (1 + x) \bullet (F - 1) = 1 + (F - 1) = F = F^1$. By induction, assuming $n \ge 1$ and
$\mathrm{Pow}_n \bullet (F - 1) = F^n$, the result just proved shows that

$$\mathrm{Pow}_{n+1} \bullet (F - 1) = (\mathrm{Pow}_n \bullet (F - 1))(\mathrm{Pow}_1 \bullet (F - 1)) = F^n F^1 = F^{n+1}.$$

(The last step uses the algebraic definition of the power $F^{n+1}$.) Similarly, the known identity

$$(\mathrm{Pow}_{-n} \bullet (F - 1))(\mathrm{Pow}_n \bullet (F - 1)) = \mathrm{Pow}_0 \bullet (F - 1) = 1$$

shows that $\mathrm{Pow}_{-n} \bullet (F - 1)$ is the multiplicative inverse of $\mathrm{Pow}_n \bullet (F - 1) = F^n$. In other
words, $\mathrm{Pow}_{-n} \bullet (F - 1) = (F^n)^{-1} = F^{-n}$. □


7.76. Example: Negative Binomial Expansion. Suppose $F = 1 - cx$ where $c \in K$ is a
constant. Let $r$ be a positive integer. Using 7.72 and the definition of composition, we find
that

$$(1 - cx)^{-r} = \Big( \frac{1}{1 - cx} \Big)^{r} = \sum_{n=0}^{\infty} \binom{n + r - 1}{n} c^n x^n. \tag{7.7}$$


This identity is used often when computing with generating functions. 
Next we prove a partial version of another familiar law of exponents. 


7.77. Theorem: Iterated Exponents. For all $F \in K[[x]]$ with $F(0) = 1$, all $r \in K$ and all integers $n$,
$(F^r)^n = F^{rn}$.

Proof. The idea is to iterate the known identity $F^{r+s} = F^r F^s$, which holds for all $r, s \in K$.
First, we prove the result for integers $n \ge 0$ by induction. When $n = 0$, $(F^r)^0 = 1 = F^{r \cdot 0}$.
When $n = 1$, $(F^r)^1 = F^r = F^{r \cdot 1}$. Assuming $n \ge 1$ and $(F^r)^n = F^{rn}$ is already known, we
calculate

$$(F^r)^{n+1} = (F^r)^n (F^r)^1 = F^{rn} F^r = F^{rn + r} = F^{r(n+1)}.$$

Next, $(F^r)^{-n} (F^r)^n = (F^r)^{-n+n} = 1$, and hence $(F^r)^{-n} = ((F^r)^n)^{-1} = (F^{rn})^{-1}$. Similarly,
$F^{-rn} F^{rn} = F^{-rn + rn} = 1$, so that $(F^{rn})^{-1} = F^{-rn}$. So finally $(F^r)^{-n} = F^{r(-n)}$, which
establishes the result for negative integers. □


The next result is the analogue of the fact that every positive real number has a unique 
positive nth root, for each n > 0. 


7.78. Theorem: Existence and Uniqueness of nth Roots. Suppose $F \in K[[x]]$
satisfies $F(0) = 1$. For every integer $n \ge 1$, there exists a unique $G \in K[[x]]$ with $G(0) = 1$
such that $G^n = F$, namely $G = F^{1/n} = \sqrt[n]{F}$.


Proof. Existence of $G$ follows from the previous result, since $(F^{1/n})^n = F^{(1/n) \cdot n} = F^1 = F$.
To prove uniqueness, suppose $G, H \in K[[x]]$ satisfy $G^n = F = H^n$ and $G(0) = 1 = H(0)$.
By the distributive law, we have the factorization

$$0 = G^n - H^n = (G - H)(G^{n-1} + G^{n-2} H + G^{n-3} H^2 + \cdots + H^{n-1}).$$

Since $K[[x]]$ is an integral domain, this implies that either $G - H = 0$ or
$\sum_{i=0}^{n-1} G^{n-1-i} H^i = 0$. The first alternative gives $G = H$, as desired. The second alternative is impossible, since
the left side has constant term $\sum_{i=0}^{n-1} 1^{n-1-i} 1^i = n > 0$ while the right side has constant
term zero. (This proof uses the assumption that $K$ is a field containing $\mathbb{Q}$.) □


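7.79. Theorem: Formal Power Rule. Suppose $F \in K[[x]]$ satisfies $F(0) = 1$. For $r \in K$,
$(F^r)' = r F^{r-1} F'$.


Proof. Let us first show that $\mathrm{Pow}_r' = r\, \mathrm{Pow}_{r-1}$. The formal derivative of
$\mathrm{Pow}_r = \sum_{n \ge 0} ((r){\downarrow}_n / n!)\, x^n$ is

$$\mathrm{Pow}_r' = \sum_{n \ge 0} \frac{(n + 1)\, (r){\downarrow}_{n+1}}{(n + 1)!} x^n = \sum_{n \ge 0} \frac{r\, (r - 1){\downarrow}_n}{n!} x^n = r\, \mathrm{Pow}_{r-1}.$$

It follows that

$$(F^r)' = [\mathrm{Pow}_r \bullet (F - 1)]' = [\mathrm{Pow}_r' \bullet (F - 1)]\, (F - 1)' = [r\, \mathrm{Pow}_{r-1} \bullet (F - 1)]\, F' = r F^{r-1} F'.$$
□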


We now have the necessary machinery to solve certain quadratic equations involving 
formal power series. 


7.80. Example. Suppose $F \in \mathbb{Q}[[x]]$ is a formal series such that $x F^2 - F + 1 = 0$. Let
us "solve for $F$," determining $F_n$ for all $n \in \mathbb{N}$. The given equation immediately implies
$F_0 = 1$. Multiplying the quadratic equation by $4x$ gives $4x^2 F^2 - 4xF + 4x = 0$. Completing
the square leads to $(1 - 2xF)^2 = 1 - 4x$. Since $\sqrt{1 - 4x}$ is the unique power series with
constant term 1 that squares to $1 - 4x$, we conclude that

$$1 - 2xF = \sqrt{1 - 4x}.$$

Rearranging, $xF = \frac{1}{2}(1 - \sqrt{1 - 4x})$. Since the power series on the right has zero constant
term, we may safely write

$$F = \frac{1 - \sqrt{1 - 4x}}{2x},$$

where the notation $\frac{1}{x} \sum_{n \ge 1} a_n x^n$ can be regarded as shorthand for the series $\sum_{n \ge 0} a_{n+1} x^n$.
(Note $x$ is not a unit in $\mathbb{Q}[[x]]$, so algebraic division by $x$ is not permitted in this ring,
although we could allow it by passage to the field of fractions (§7.7). What we are really
doing is cancelling the nonzero element $x$ in the integral domain $\mathbb{Q}[[x]]$; cf. 7.135.)

We remark that our formula for $F$ is exactly what we would have obtained by formally
applying the quadratic formula for solving $AF^2 + BF + C = 0$, which gives
$F = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A}$. However, one must take care in blindly applying this formula since
we cannot divide by arbitrary formal power series, and the sign ambiguity in the square
root must be resolved somehow. In our example, we are forced to choose the minus sign.


Now we are ready to find $F_n$ for each $n \ge 0$. By 7.74,

$$\sqrt{1 - 4x} = \mathrm{Pow}_{1/2} \bullet (-4x) = \sum_{m \ge 0} \frac{(1/2){\downarrow}_m}{m!} (-4)^m x^m.$$

For $m \ge 1$, the coefficient of $x^m$ here is

$$(-1)^m 4^m \frac{(1/2)(-1/2)(-3/2) \cdots (-(2m - 3)/2)}{m!} = -2^m \frac{1 \cdot 3 \cdot 5 \cdots (2m - 3)}{m!}$$
$$= -\frac{2^m (2m - 2)!}{m! \cdot 2 \cdot 4 \cdot 6 \cdots (2m - 2)} = -\frac{2\, (2m - 2)!}{m!\, (m - 1)!},$$

where the last step follows by using $m - 1$ powers of 2 in the numerator to divide each of
the even numbers in the denominator by 2. It now follows that

$$1 - \sqrt{1 - 4x} = \sum_{m \ge 1} \frac{2\, (2m - 2)!}{m!\, (m - 1)!} x^m.$$

Finally,

$$F = \frac{1 - \sqrt{1 - 4x}}{2x} = \sum_{n \ge 0} \frac{1}{n + 1} \binom{2n}{n} x^n = \sum_{n \ge 0} \frac{1}{2n + 1} \binom{2n + 1}{n} x^n.$$

Thus, $F = \sum_{n \ge 0} C_n x^n$ is the generating function for the Catalan numbers (see 1.55).


Calculations such as those in the preceding example occur frequently in the application 
of formal power series to combinatorial problems. 
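
As a numerical cross-check of Example 7.80, the sketch below computes the coefficients of $\sqrt{1 - 4x} = \mathrm{Pow}_{1/2} \bullet (-4x)$ and reads off $F_n$ from $F = (1 - \sqrt{1 - 4x})/(2x)$. It is written in Python with exact fractions; the function names are hypothetical, and composing with $-4x$ is implemented simply by scaling the $m$th coefficient by $(-4)^m$.

```python
from fractions import Fraction
from math import comb

def sqrt_one_minus_4x(N):
    """Coefficients 0..N-1 of Pow_{1/2} composed with -4x."""
    coeffs = []
    c = Fraction(1)  # (1/2)-falling-0 divided by 0!
    for m in range(N):
        coeffs.append(c * (-4) ** m)
        # update to the next coefficient of Pow_{1/2}
        c = c * (Fraction(1, 2) - m) / (m + 1)
    return coeffs

def catalan_from_series(N):
    """F_n for n < N, where F = (1 - sqrt(1-4x)) / (2x)."""
    s = sqrt_one_minus_4x(N + 1)
    # dividing by x shifts indices down by one
    return [-s[n + 1] / 2 for n in range(N)]

print(catalan_from_series(8))
print([comb(2 * n, n) // (n + 1) for n in range(8)])
```

Both printed lists should agree, matching the closed formula $C_n = \frac{1}{n+1}\binom{2n}{n}$ derived above.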


7.13 Partial Fraction Expansions 


Suppose $g$ is a polynomial in $\mathbb{C}[x]$ with nonzero constant term, and $f$ is any polynomial in
$\mathbb{C}[x]$. We have seen (§7.6) that $g$ is a unit in $\mathbb{C}[[x]]$, so that we can write $f/g = \sum_{n=0}^{\infty} b_n x^n$ for
suitable complex numbers $b_n$. This section presents a technique for finding explicit expressions
for the coefficients $b_n$, which is a formal version of the "method of partial fractions"
from calculus. We will see that this technique can be used to find explicit closed formulas
for certain recursively defined sequences. Our starting point is the famous fundamental
theorem of algebra, which we state here without proof.




7.81. Fundamental Theorem of Algebra. Let $p \in \mathbb{C}[x]$ be a monic polynomial of degree
$n \ge 1$. There exist pairwise distinct complex numbers $r_1, \ldots, r_k$ (unique up to reordering)
and unique positive integers $n_1, \ldots, n_k$ such that

$$p = (x - r_1)^{n_1} (x - r_2)^{n_2} \cdots (x - r_k)^{n_k} \in \mathbb{C}[x].$$

The number $r_i$ is called a root of $p$ of multiplicity $n_i$.


The following variant of the fundamental theorem is needed in partial fraction problems 
because of the form of the negative binomial expansion (see 7.76). 


7.82. Theorem: Factorization of Polynomials in $\mathbb{C}[x]$. Let $p \in \mathbb{C}[x]$ be a polynomial
of degree $n \ge 1$ with $p(0) = 1$. There exist pairwise distinct, nonzero complex numbers
$r_1, \ldots, r_k$ and positive integers $n_1, \ldots, n_k$ such that

$$p(x) = (1 - r_1 x)^{n_1} (1 - r_2 x)^{n_2} \cdots (1 - r_k x)^{n_k} \in \mathbb{C}[x].$$


Proof. Consider the polynomial $q = x^n P_p(1/x)$. We have $p = \sum_{i=0}^{n} p_i x^i$ and
$q = \sum_{i=0}^{n} p_{n-i} x^i$, so that $q$ is obtained from $p$ by "reversing the coefficient sequence." Since
$p_0 = 1$, $q$ is a monic polynomial of degree $n$. Using the fundamental theorem of algebra, we
write

$$x^n P_p(1/x) = q = \prod_{i=1}^{k} (x - r_i)^{n_i},$$

where $\sum n_i = n$. Since the constant term of $q$ is nonzero, no $r_i$ is equal to zero. Reversing
the coefficient sequence again, it follows that

$$p = x^n P_q(1/x) = x^n \prod_{i=1}^{k} ((1/x) - r_i)^{n_i} = \prod_{i=1}^{k} x^{n_i} \Big( \frac{1 - r_i x}{x} \Big)^{n_i} = \prod_{i=1}^{k} (1 - r_i x)^{n_i}.$$
□


The next step is to rewrite a general fraction $f/g$ as a sum of fractions whose denominators
have the form $(1 - rx)^m$. Note that, as long as $g(0) \ne 0$, we can always arrange
$g(0) = 1$ by multiplying numerator and denominator by a suitable scalar in $\mathbb{C}$.


7.83. Theorem: Splitting a Denominator. Suppose $f, g \in \mathbb{C}[x]$ are polynomials such
that $g(0) = 1$, and let $g$ have factorization $g(x) = \prod_{i=1}^{k} (1 - r_i x)^{n_i}$, where $r_1, \ldots, r_k \in \mathbb{C}$ are
distinct and nonzero. There exist polynomials $p_0, p_1, \ldots, p_k$ with $\deg(p_i) < n_i$ (or $p_i = 0$)
for $1 \le i \le k$, such that

$$\frac{f}{g} = p_0 + \sum_{i=1}^{k} \frac{p_i}{(1 - r_i x)^{n_i}}.$$


Proof. For $1 \le i \le k$, define a polynomial $h_i = g / (1 - r_i x)^{n_i} = \prod_{j : j \ne i} (1 - r_j x)^{n_j}$. Since
$r_1, \ldots, r_k$ are distinct, $\gcd(h_1, \ldots, h_k) = 1$. By a well-known result from polynomial algebra,
it follows that there exist polynomials $q_1, \ldots, q_k \in \mathbb{C}[x]$ with $q_1 h_1 + \cdots + q_k h_k = 1$. Therefore,

$$\frac{f}{g} = \frac{f \cdot 1}{g} = \frac{f q_1 h_1 + \cdots + f q_k h_k}{g} = \sum_{i=1}^{k} \frac{f q_i}{(1 - r_i x)^{n_i}}.$$

This is almost the answer we want, but the degrees of the numerators may be too high. Using
polynomial division (see 5.87), we can write $f q_i = a_i (1 - r_i x)^{n_i} + p_i$ where $a_i, p_i \in \mathbb{C}[x]$,
and either $p_i = 0$ or $\deg(p_i) < n_i$. Dividing by $(1 - r_i x)^{n_i}$, we see that

$$\frac{f}{g} = p_0 + \sum_{i=1}^{k} \frac{p_i}{(1 - r_i x)^{n_i}}$$

holds if we take $p_0 = \sum_{i=1}^{k} a_i \in \mathbb{C}[x]$. □


The fractions $p_i / (1 - r_i x)^{n_i}$ (with $\deg(p_i) < n_i$ or $p_i = 0$) can be further reduced into
sums of fractions where the numerators are complex constants.


7.84. Theorem: Division by $(1 - rx)^n$. Given a fraction $p / (1 - rx)^n$ where $p \in \mathbb{C}[x]$,
$\deg(p) < n$ (or $p = 0$), and $0 \ne r \in \mathbb{C}$, there exist complex numbers $a_1, \ldots, a_n$ such that

$$\frac{p}{(1 - rx)^n} = \sum_{j=1}^{n} \frac{a_j}{(1 - rx)^j}.$$

Proof. Consider the evaluation homomorphism $E : \mathbb{C}[x] \to \mathbb{C}[x]$ such that $E(x) = 1 - rx$
(see 7.22). The evaluation homomorphism $E' : \mathbb{C}[x] \to \mathbb{C}[x]$ such that $E'(x) = (1 - x)/r$ is a
two-sided inverse to $E$ (since $E(E'(x)) = x = E'(E(x))$), so $E$ is a bijection. In particular,
$E$ is surjective, so $p = E(q)$ for some $q \in \mathbb{C}[x]$. Now, one may check that $E$ and $E'$ each map
polynomials of degree $< n$ to polynomials of degree $< n$, and it follows that $\deg(q) < n$ (or
$q = 0$). Write $q = c_0 + c_1 x + c_2 x^2 + \cdots + c_{n-1} x^{n-1}$, with $c_i \in \mathbb{C}$. Then

$$p = E(q) = c_0 + c_1 (1 - rx) + c_2 (1 - rx)^2 + \cdots + c_{n-1} (1 - rx)^{n-1}.$$

Dividing by $(1 - rx)^n$, we see that we may take $a_1 = c_{n-1}, \ldots, a_{n-1} = c_1, a_n = c_0$. □


The next result summarizes the partial fraction manipulations in the last two theorems. 
The uniqueness proof given below also provides a convenient algorithm for finding the 
coefficients in the partial fraction decomposition. 


7.85. Theorem: Partial Fraction Decompositions in $\mathbb{C}(x)$. Suppose $f, g \in \mathbb{C}[x]$ are
polynomials with $g(0) = 1$; let $g = \prod_{i=1}^{k} (1 - r_i x)^{n_i}$ where the $r_i$ are distinct nonzero
complex numbers. There exist a unique polynomial $h \in \mathbb{C}[x]$ and unique complex numbers
$a_{ij}$ (where $1 \le i \le k$, $1 \le j \le n_i$) with

$$\frac{f}{g} = h + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \frac{a_{ij}}{(1 - r_i x)^j}. \tag{7.8}$$

Viewing $f/g \in \mathbb{C}[[x]]$, we have (for all $m \in \mathbb{N}$)

$$(f/g)_m = h_m + \sum_{i=1}^{k} \sum_{j=1}^{n_i} a_{ij} \binom{m + j - 1}{m} r_i^m.$$

Proof. Existence of the decomposition follows by combining 7.83 and 7.84. The formula
for the coefficient of $x^m$ follows from the negative binomial expansion 7.76. We must now
prove uniqueness of $h$ and the $a_{ij}$'s. Note first that the numbers $r_i$ and $n_i$ appearing in the
factorization of $g$ are unique (this follows from the uniqueness assertion in the fundamental
theorem of algebra). Now consider any expression of the form (7.8). Multiplying both sides
by $g$ produces an equation

$$f = gh + \sum_{i=1}^{k} \sum_{j=1}^{n_i} a_{ij} (1 - r_i x)^{n_i - j} \prod_{s \ne i} (1 - r_s x)^{n_s}, \tag{7.9}$$

where both sides are polynomials. Furthermore, the terms in the double sum add up to a
polynomial that is either zero or has degree less than $\deg(g)$. Thus $h$ must be the quotient
when $f$ is divided by $g$ using the polynomial division algorithm, and this quotient is known
to be unique. Next, we show how to recover the "top coefficients" $a_{i,n_i}$ for $1 \le i \le k$. Fix $i$,
and apply the functions associated to the polynomials on each side of (7.9) to $x = 1/r_i \in \mathbb{C}$.
Since any positive power of $(1 - r_i x)$ becomes zero for this choice of $x$, all but one term on
the right side becomes zero. We are left with

$$P_f(1/r_i) = a_{i,n_i} \prod_{s \ne i} (1 - r_s / r_i)^{n_s}.$$

Since $r_s \ne r_i$ for $s \ne i$, the product is nonzero. Thus there is a unique $a_{i,n_i} \in \mathbb{C}$ for which
this equation holds. We can use the displayed formula to calculate each $a_{i,n_i}$ given $f$ and $g$.

To find the remaining $a_{ij}$'s, subtract the recovered summands $a_{i,n_i} / (1 - r_i x)^{n_i}$ from both
sides of (7.8) (thus replacing $f/g$ by a new fraction $f_1/g_1$) to obtain a new problem in which
all $n_i$'s have been reduced by one. We now repeat the procedure of the previous paragraph
to find $a_{i,n_i - 1}$ for all $i$. Continuing similarly, we eventually recover all the $a_{ij}$. □
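
In the special case where every multiplicity $n_i$ equals 1 and $\deg(f) < \deg(g)$ (so $h = 0$), the evaluation step of the uniqueness proof already yields every coefficient. The Python sketch below covers only this simplified case, using rational $r_i$ for exactness; the function names are hypothetical. It compares the resulting closed formula $\sum_i A_i r_i^n$ with coefficients obtained by dividing the series directly.

```python
from fractions import Fraction

def simple_root_coefficients(f, rs):
    """For g = prod_i (1 - r_i x) with distinct nonzero r_i and deg f < len(rs),
    return A_i with f/g = sum_i A_i / (1 - r_i x), by evaluating at x = 1/r_i."""
    out = []
    for i, ri in enumerate(rs):
        x = 1 / Fraction(ri)
        num = sum(Fraction(c) * x ** j for j, c in enumerate(f))
        den = Fraction(1)
        for s, r_s in enumerate(rs):
            if s != i:
                den *= 1 - r_s * x
        out.append(num / den)
    return out

def series_quotient(f, g, N):
    """First N coefficients of f/g in K[[x]], assuming g[0] == 1."""
    b = []
    for n in range(N):
        fn = Fraction(f[n]) if n < len(f) else Fraction(0)
        tail = sum(Fraction(g[j]) * b[n - j]
                   for j in range(1, min(n, len(g) - 1) + 1))
        b.append(fn - tail)
    return b

# f = x^2 - 2 and g = 1 - 2x - x^2 + 2x^3 = (1 - 2x)(1 - x)(1 + x)
f, g, rs = [-2, 0, 1], [1, -2, -1, 2], [2, 1, -1]
A = simple_root_coefficients(f, rs)
print(A)
print(series_quotient(f, g, 6))
print([sum(a * Fraction(r) ** n for a, r in zip(A, rs)) for n in range(6)])
```

Repeated roots would require the subtract-and-repeat step described in the proof, which this sketch does not implement.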


7.86. Example. Let us find the partial fraction expansion of

$$\frac{f}{g} = \frac{x^2 - 2}{1 - 2x - x^2 + 2x^3}.$$

To find the required factorization of the denominator, we first reverse the coefficient sequence
to obtain $x^3 - 2x^2 - x + 2$. This polynomial factors as $(x - 2)(x - 1)(x + 1)$, so the original
denominator can be rewritten as

$$1 - 2x - x^2 + 2x^3 = (1 - 2x)(1 - x)(1 + x)$$

(see the proof of 7.82). We know that

$$\frac{x^2 - 2}{1 - 2x - x^2 + 2x^3} = \frac{A}{1 - 2x} + \frac{B}{1 - x} + \frac{C}{1 + x} \tag{7.10}$$

for suitable complex constants $A, B, C$. To find $A$, multiply both sides by $1 - 2x$ to get

$$\frac{x^2 - 2}{(1 - x)(1 + x)} = A + \frac{B(1 - 2x)}{1 - x} + \frac{C(1 - 2x)}{1 + x}.$$

Now set $x = 1/2$ to see that $A = (-7/4)/(3/4) = -7/3$. Similarly,

$$B = \left. \frac{x^2 - 2}{(1 - 2x)(1 + x)} \right|_{x = 1} = \frac{1}{2}; \qquad C = \left. \frac{x^2 - 2}{(1 - 2x)(1 - x)} \right|_{x = -1} = -\frac{1}{6}.$$

It now follows from (7.10) that

$$\Big( \frac{x^2 - 2}{1 - 2x - x^2 + 2x^3} \Big)_n = -\frac{7}{3} \cdot 2^n + \frac{1}{2} - \frac{1}{6} (-1)^n \qquad (n \in \mathbb{N}).$$


7.87. Example. We will find the partial fraction expansion of

$$\frac{f}{g} = \frac{1}{1 - 9x + 30x^2 - 46x^3 + 33x^4 - 9x^5}.$$

Factoring the denominator as in the last example, we find that $g(x) = (1 - x)^3 (1 - 3x)^2$. We
can therefore write

$$\frac{f}{g} = \frac{A}{(1 - x)^3} + \frac{B}{(1 - x)^2} + \frac{C}{1 - x} + \frac{D}{(1 - 3x)^2} + \frac{E}{1 - 3x}. \tag{7.11}$$

To find $A$, multiply both sides by $(1 - x)^3$ and then substitute $x = 1$ to get $A = 1/(-2)^2 = 1/4$.
Similarly, multiplication by $(1 - 3x)^2$ followed by the substitution $x = 1/3$ reveals that
$D = 1/(2/3)^3 = 27/8$. Having found $A$ and $D$, we subtract $A/(1 - x)^3$ and $D/(1 - 3x)^2$ from
both sides of (7.11). After simplifying, we are left with

$$\frac{(3/8)(3x - 7)}{(1 - x)^2 (1 - 3x)} = \frac{B}{(1 - x)^2} + \frac{C}{1 - x} + \frac{E}{1 - 3x}.$$

Now we repeat the process. Multiplying by $(1 - x)^2$ and setting $x = 1$ shows that $B = 3/4$.
Similarly, $E = -81/16$. Subtracting these terms from both sides leaves $(27/16)/(1 - x)$, so
$C = 27/16$. Using (7.11) and 7.76, we conclude that

$$(f/g)_n = \frac{1}{4} \binom{n + 2}{2} + \frac{3}{4} \binom{n + 1}{1} + \frac{27}{16} + \frac{27}{8} \binom{n + 1}{1} 3^n - \frac{81}{16} 3^n \qquad (n \in \mathbb{N}).$$


7.14 Application to Recursions 


In Chapter 2, we saw that many enumeration problems in combinatorics lead naturally to 
recursion relations. Formal power series and partial fraction expansions provide a powerful 
method for solving a wide class of recursions. Before stating the general method, we consider 
some typical examples. 


7.88. Example. In 2.22, we found that the number $a_n$ of subsets of an $n$-element set
satisfies the following recursion and initial condition:

$$a_n = 2 a_{n-1} \quad (n \ge 1); \qquad a_0 = 1.$$

It is not hard to guess that the solution to this recursion is $a_n = 2^n$ for all $n \ge 0$, and
it is then routine to prove that this guess is correct by induction on $n$. However, for more
complicated recursions, one is unlikely to find the solution by guessing. Thus, let us see how
to solve the recursion using formal power series.

We introduce the formal series $F = \sum_{n \ge 0} a_n x^n$ whose coefficients are given by the unknown
sequence $(a_n)$. Notice that $xF = \sum_{m \ge 0} a_m x^{m+1} = \sum_{n \ge 1} a_{n-1} x^n$. By the recursive
description of the $a_n$'s, we see that $(F - 2xF)_n = 0$ for all $n \ge 1$. On the other hand,
$(F - 2xF)_0 = F_0 = a_0 = 1$ by the initial condition. It follows that

$$F - 2xF = 1 \in \mathbb{C}[[x]].$$

Solving for $F$, we find that

$$F = \frac{1}{1 - 2x} = \sum_{n=0}^{\infty} 2^n x^n.$$

Comparing coefficients of $x^n$ leads to the expected solution $a_n = 2^n$.

Now let us modify the problem by changing the initial condition to $a_0 = 3$. The same
reasoning as above leads to $F = 3/(1 - 2x)$, so that the new solution is $a_n = 3 \cdot 2^n$.

For a more subtle modification, let us change the recursion to $a_n = 2 a_{n-1} + 1$, with
initial condition $a_0 = 0$. (This recursion describes the number of moves needed to solve
the famous "Tower of Hanoi" puzzle.) Define $F = \sum_{n \ge 0} a_n x^n$ as before. We now have
$(F - 2xF)_n = a_n - 2 a_{n-1} = 1$ for all $n \ge 1$, and $(F - 2xF)_0 = a_0 = 0$. We conclude that

$$F - 2xF = x + x^2 + x^3 + \cdots = \frac{x}{1 - x}.$$

Solving for $F$ and using partial fractions, we get

$$F = \frac{x}{(1 - x)(1 - 2x)} = \frac{1}{1 - 2x} - \frac{1}{1 - x}.$$

Extracting the coefficient of $x^n$ yields the solution $a_n = 2^n - 1^n = 2^n - 1$ for all $n \ge 0$.


7.89. Example: Fibonacci Recursion. The Fibonacci numbers are defined by the recursion
$f_n = f_{n-1} + f_{n-2}$ for all $n \ge 2$, with initial conditions $f_0 = 0$, $f_1 = 1$. (Sometimes
other initial conditions are used, such as $f_0 = f_1 = 1$, which leads to a shift in the indexing
of the sequence.) Let us use formal power series to find an explicit closed formula for the
numbers $f_n$. Define $F = \sum_{n \ge 0} f_n x^n$. Since $(xF)_n = f_{n-1}$ and $(x^2 F)_n = f_{n-2}$ for all $n \ge 2$,
the recursion gives

$$(F - xF - x^2 F)_n = 0 \qquad (n \ge 2).$$

On the other hand, the initial conditions show that

$$(F - xF - x^2 F)_0 = f_0 = 0; \qquad (F - xF - x^2 F)_1 = f_1 - f_0 = 1.$$

It follows that $F - xF - x^2 F = 0 + 1x + 0x^2 + \cdots = x$. Solving for $F$ gives

$$F = \frac{x}{1 - x - x^2}.$$

We now apply the method of partial fractions. First, reversing the coefficient sequence in
the denominator gives the polynomial $x^2 - x - 1 = (x - r_1)(x - r_2)$, where (by the quadratic
formula)

$$r_1 = \frac{1 + \sqrt{5}}{2}, \qquad r_2 = \frac{1 - \sqrt{5}}{2}.$$

It follows that $1 - x - x^2 = (1 - r_1 x)(1 - r_2 x)$. Next write

$$F = \frac{x}{(1 - r_1 x)(1 - r_2 x)} = \frac{A}{1 - r_1 x} + \frac{B}{1 - r_2 x}.$$

Multiplying both sides by $(1 - r_1 x)$ and setting $x = 1/r_1$, we find that
$A = (1/r_1)/(1 - r_2/r_1) = 1/(r_1 - r_2) = 1/\sqrt{5}$. Similarly we find that $B = -1/\sqrt{5}$, so that

$$F = \frac{x}{(1 - r_1 x)(1 - r_2 x)} = \frac{1}{\sqrt{5}} \Big( \frac{1}{1 - r_1 x} - \frac{1}{1 - r_2 x} \Big).$$

Extracting the coefficient of $x^n$, we conclude that the Fibonacci numbers are given by the
following exact formula:

$$f_n = (r_1^n - r_2^n)/\sqrt{5} = \frac{1}{2^n \sqrt{5}} \Big[ (1 + \sqrt{5})^n - (1 - \sqrt{5})^n \Big] \qquad (n \ge 0).$$

Note that $|r_2| \approx 0.618 < 1$, so that $\lim_{n \to \infty} r_2^n = 0$. It follows that, for very large $n$,
$f_n \approx r_1^n / \sqrt{5} \approx (0.447214) \cdot (1.61803)^n$.


This formula tells us the asymptotic growth rate of the Fibonacci numbers. 
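
The closed formula can be compared against coefficients computed directly from $(1 - x - x^2) F = x$. The Python sketch below does this with floating-point arithmetic and rounding, which is an assumption that is adequate only for moderate $n$; the function names are illustrative.

```python
from math import sqrt

def fib_series(N):
    """Coefficients of x / (1 - x - x^2), solving (1 - x - x^2) F = x term by term."""
    f = []
    for n in range(N):
        rhs = 1 if n == 1 else 0  # coefficient of x^n on the right side
        prev1 = f[n - 1] if n >= 1 else 0
        prev2 = f[n - 2] if n >= 2 else 0
        f.append(rhs + prev1 + prev2)
    return f

def fib_closed(n):
    """Closed formula (r1^n - r2^n) / sqrt(5), rounded to the nearest integer."""
    r1 = (1 + sqrt(5)) / 2
    r2 = (1 - sqrt(5)) / 2
    return round((r1 ** n - r2 ** n) / sqrt(5))

print(fib_series(12))
print([fib_closed(n) for n in range(12)])
```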
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The next theorem gives a general method for solving recursion relations with constant 
coefficients. 


7.90. Theorem: Recursions with Constant Coefficients. Suppose we are given the 
following data: a positive integer k, constants c,,C2,...,Ck,do,---,dp—-1 € K, and a function 
g:N—K. The recursion 


An, = Cj On—1 + C9dn—9 +++ + ChOn—~ + 9(n) (n > k) 


with initial conditions a; = d; for 0 < i < k has a unique solution. Setting d, = d; — 
k-1 

c1dj_1 — cedj_2g — +++ — do, F = Sb ana”, G = S75 Ga" + isk g(n)a”, and p = 

1— cx — cox” — --- —cyx*, we have F = G/p. 


Proof. The existence and uniqueness of the sequence $(a_n : n \ge 0)$ satisfying the given
recursion and initial conditions is intuitively plausible and can be informally established by
an induction argument. (A formal proof requires the recursion theorem from set theory; see
Section 12 of Halmos [66] for a discussion of this theorem.) It follows that the formal series
$F \in K[[x]]$ in the theorem statement is well defined. Consider next the formal series
\[ H = (1 - c_1x - c_2x^2 - \cdots - c_kx^k)F = pF. \]
For each $n \ge k$, the recursion shows that
\[ H_n = a_n - c_1a_{n-1} - \cdots - c_ka_{n-k} = g(n) = G_n. \]
On the other hand, for $0 \le n < k$, the initial conditions show that
\[ H_n = d'_n = G_n. \]
So $H = G$, and the formula for $F$ follows by dividing the equation $G = pF$ by the invertible
element $p$. □
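The theorem can be turned into a short computation: build the coefficients of $G$, then divide by $p$ one coefficient at a time using $F_n = G_n + \sum_{j=1}^{\min(n,k)} c_jF_{n-j}$, which is just $pF = G$ rearranged. The sketch below is illustrative only; the function names and the list-based representation (`c = [c_1, ..., c_k]`, `d = [d_0, ..., d_{k-1}]`) are assumptions, not notation from the text.

```python
from fractions import Fraction

def solve_by_series(c, d, g, terms):
    """Coefficients a_0..a_{terms-1} of F = G/p as in Theorem 7.90."""
    k = len(c)
    G = []
    for n in range(terms):
        if n < k:
            # d'_n = d_n - c_1 d_{n-1} - ... - c_n d_0
            G.append(Fraction(d[n]) - sum(Fraction(c[j - 1]) * d[n - j] for j in range(1, n + 1)))
        else:
            G.append(Fraction(g(n)))
    F = []
    for n in range(terms):
        # Compare coefficients of x^n in pF = G, where p = 1 - c_1 x - ... - c_k x^k.
        F.append(G[n] + sum(Fraction(c[j - 1]) * F[n - j] for j in range(1, min(n, k) + 1)))
    return F

def solve_directly(c, d, g, terms):
    """Same sequence, computed by iterating the recursion itself."""
    k = len(c)
    a = [Fraction(v) for v in d]
    for n in range(k, terms):
        a.append(sum(Fraction(c[j - 1]) * a[n - j] for j in range(1, k + 1)) + g(n))
    return a[:terms]
```

For example, `solve_by_series([1, 1], [0, 1], lambda n: 0, 10)` reproduces the Fibonacci numbers $f_0, \ldots, f_9$.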


7.91. Example. Let us solve the recursion
\[ a_n = 5a_{n-1} - 6a_{n-2} + 2^n \qquad (n \ge 2) \]
subject to the initial conditions $a_0 = 0$, $a_1 = 1$. Use the theorem with $k = 2$, $c_1 = 5$, $c_2 = -6$,
$d_0 = 0$, $d_1 = 1$, and $g(n) = 2^n$. We find that $d'_0 = 0$, $d'_1 = 1$, $G = 0 + 1x + \sum_{n \ge 2} 2^nx^n =
(1 - 2x)^{-1} - 1 - x$, $p = 1 - 5x + 6x^2 = (1 - 2x)(1 - 3x)$, and finally
\[ F = \frac{(1 - 2x)^{-1} - 1 - x}{(1 - 2x)(1 - 3x)} = \frac{x + 2x^2}{(1 - 2x)^2(1 - 3x)}. \]

A tedious but routine partial fraction computation gives us
\[ F = -2(1 - 2x)^{-2} - 3(1 - 2x)^{-1} + 5(1 - 3x)^{-1}. \]
Extracting the coefficient of $x^n$ from each term, we obtain
\[ a_n = -2\binom{n+1}{1}2^n - 3 \cdot 2^n + 5 \cdot 3^n = 5 \cdot 3^n - (2n + 5)2^n. \]

Once this formula has been found, one may check the answer by proving by induction that
it satisfies the initial conditions and the recursion.
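Short of an induction proof, a quick numerical comparison gives some reassurance that the closed formula $a_n = 5 \cdot 3^n - (2n+5)2^n$ matches the recursion. This is only a sanity check on finitely many terms, sketched with illustrative function names.

```python
def a_recursive(terms):
    """Iterate a_n = 5 a_{n-1} - 6 a_{n-2} + 2^n with a_0 = 0, a_1 = 1."""
    a = [0, 1]
    for n in range(2, terms):
        a.append(5 * a[n - 1] - 6 * a[n - 2] + 2 ** n)
    return a[:terms]

def a_closed(n):
    """Closed formula obtained from the partial fraction expansion."""
    return 5 * 3 ** n - (2 * n + 5) * 2 ** n
```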
7.15 Formal Exponentiation and Formal Logarithms


In 7.67, we introduced formal series
\[ E = \sum_{n \ge 1} x^n/n! = e^x - 1, \qquad L = \sum_{n \ge 1} (-1)^{n-1}x^n/n = \log(1 + x), \]
and showed that $L \bullet E = x = E \bullet L$. We can use these series and formal composition to
define the exponential and logarithm of certain power series.


7.92. Definition: exp(G) and log(1 + G). Suppose $G$ is a formal power series with
constant term zero. We define
\[ e^G = \exp(G) = (E \bullet G) + 1 = \sum_{n=0}^{\infty} G^n/n!; \qquad \log(1 + G) = L \bullet G = \sum_{n \ge 1} (-1)^{n-1}G^n/n. \]
Similarly, if $H$ is a formal power series with $H(0) = 1$, define
\[ \log H = \log(1 + [H - 1]) = L \bullet (H - 1). \]
The combinatorial significance of exponentiating a formal power series will be revealed
in the next chapter (see 8.32). For the moment, we will be content to prove some properties
of the ordinary exponential and logarithm functions that are also satisfied by their formal
counterparts.
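Since $G^n$ has order at least $n$ when $G(0) = 0$, only finitely many terms of these sums affect any given coefficient, so both definitions can be computed exactly on truncated series. The following sketch assumes a series is stored as a list of its first `N` coefficients; the names `mul`, `series_exp`, `series_log`, and the truncation order `N` are illustrative choices.

```python
from fractions import Fraction

N = 8  # keep the coefficients of x^0, ..., x^(N-1)

def mul(F, G):
    """Product of two truncated series (convolution of coefficient lists)."""
    H = [Fraction(0)] * N
    for i, fi in enumerate(F):
        for j, gj in enumerate(G):
            if i + j < N:
                H[i + j] += fi * gj
    return H

def series_exp(G):
    """exp(G) = sum_{n>=0} G^n/n!, defined when G has constant term 0."""
    assert G[0] == 0
    result = [Fraction(0)] * N
    term = [Fraction(1)] + [Fraction(0)] * (N - 1)  # G^0/0!
    for n in range(N):
        result = [r + t for r, t in zip(result, term)]
        term = [t / (n + 1) for t in mul(term, G)]  # G^(n+1)/(n+1)!
    return result

def series_log(H):
    """log(H) = sum_{n>=1} (-1)^(n-1) (H-1)^n/n, defined when H has constant term 1."""
    assert H[0] == 1
    G = [Fraction(0)] + [Fraction(h) for h in H[1:]]  # G = H - 1
    result = [Fraction(0)] * N
    power = G[:]
    for n in range(1, N):
        sign = 1 if n % 2 == 1 else -1
        result = [r + sign * p / n for r, p in zip(result, power)]
        power = mul(power, G)
    return result
```

With exact rational arithmetic, `series_exp(series_log(H))` returns `H` itself for any length-`N` list `H` with `H[0] == 1`.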


7.93. Theorem: Sum-to-Product Rule for Exponentials. For all $G, H \in K[[x]]$ with
$G(0) = 0 = H(0)$, we have
\[ \exp(G + H) = \exp(G)\exp(H). \]
More generally, given a sequence $G_k \in K[[x]]$ with $G_k(0) = 0$ for all $k$ and $\lim_{k\to\infty} G_k = 0$,
\[ \exp\left(\sum_{k=1}^{\infty} G_k\right) = \prod_{k=1}^{\infty} \exp(G_k). \]


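Proof. To prove the first identity, look at the coefficient of $x^n$ on each side. For the left side,
the binomial theorem gives
\begin{align*}
\exp(G + H)_n &= \left(\sum_{k=0}^{\infty} \frac{(G + H)^k}{k!}\right)_n = \left(\sum_{k=0}^{n} \frac{(G + H)^k}{k!}\right)_n \\
&= \left(\sum_{k=0}^{n} \frac{1}{k!} \sum_{i+j=k} \frac{k!}{i!\,j!} G^iH^j\right)_n = \sum_{(i,j) \in \mathbb{N}^2:\ i+j \le n} \frac{(G^iH^j)_n}{i!\,j!} \\
&= \sum_{(i,j) \in \mathbb{N}^2:\ i+j \le n}\ \sum_{m=0}^{n} \frac{G^i(m)}{i!} \cdot \frac{H^j(n-m)}{j!}.
\end{align*}
On the right side, we get
\[ [\exp(G)\exp(H)]_n = \sum_{m=0}^{n} \exp(G)_m \exp(H)_{n-m} = \sum_{m=0}^{n} \sum_{i=0}^{n} \sum_{j=0}^{n} \frac{G^i(m)}{i!} \cdot \frac{H^j(n-m)}{j!}. \]

The two answers almost agree, but the ranges of summation for $i$ and $j$ do not quite match.
However, consider a triple $(i, j, m)$ in the last summation for which $i + j > n$. This forces
either $i > m$ or $j > n - m$, so either $G^i(m) = 0$ or $H^j(n - m) = 0$ (recall that $G^i$ has order
at least $i$ because $G(0) = 0$, and similarly for $H^j$). In any case, the summand
indexed by this triple $(i, j, m)$ is zero. Dropping these summands, we get precisely the sum
occurring in the earlier calculation.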


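Iteration of the result just proved shows that $\exp\left(\sum_{k=1}^{N} G_k\right) = \prod_{k=1}^{N} \exp(G_k)$ for any
(finite) $N \in \mathbb{N}$. To prove the same formula with $N = \infty$, we check the coefficient of
$x^M$ on each side. One sees immediately from the definition that $\mathrm{ord}(F) > M$ implies
$\mathrm{ord}(\exp(F) - 1) > M$. Choose $k_0$ large enough that $\mathrm{ord}(G_k) > M$ or $G_k = 0$ for all
$k > k_0$. Taking $F = \sum_{k > k_0} G_k$, we then have $\mathrm{ord}(F) > M$ or $F = 0$. Write $\exp(F) = 1 + H$
where $\mathrm{ord}(H) > M$ or $H = 0$. Using the result for finite sums gives
\begin{align*}
\left[\exp\left(\sum_{k=1}^{\infty} G_k\right)\right]_M &= \left[\exp\left(F + \sum_{k=1}^{k_0} G_k\right)\right]_M \\
&= \left[\exp(F) \prod_{k=1}^{k_0} \exp(G_k)\right]_M = \left[\prod_{k=1}^{k_0} \exp(G_k)\right]_M.
\end{align*}
Now, for any $k_1 \ge k_0$,
\[ \left[\prod_{k=1}^{k_0} \exp(G_k)\right]_M = \left[\prod_{k=1}^{k_1} \exp(G_k)\right]_M \]
since, for $k_0 < k \le k_1$, $\exp(G_k)$ is 1 plus terms of order larger than $M$. We conclude finally
that
\[ \left[\exp\left(\sum_{k=1}^{\infty} G_k\right)\right]_M = \left[\prod_{k=1}^{\infty} \exp(G_k)\right]_M \]
for every $M \ge 0$. □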


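7.94. Theorem: Exponential and Logarithm are Inverses. If $H \in K[[x]]$ satisfies
$H(0) = 1$, then $\exp(\log(H)) = H$. If $G \in K[[x]]$ satisfies $G(0) = 0$, then $\log(\exp(G)) = G$.

Proof. Recall from 7.67 that $E \bullet L = x = L \bullet E$. We can therefore compute
\begin{align*}
\exp(\log(H)) &= (E \bullet (L \bullet (H - 1))) + 1 = ((E \bullet L) \bullet (H - 1)) + 1 \\
&= (x \bullet (H - 1)) + 1 = H - 1 + 1 = H; \\
\log(\exp(G)) &= L \bullet (([E \bullet G] + 1) - 1) = L \bullet (E \bullet G) \\
&= (L \bullet E) \bullet G = x \bullet G = G. \qquad □
\end{align*}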
7.95. Theorem: Logarithm of a Product. For all $G, H \in K[[x]]$ with $G(0) = 1 = H(0)$,
\[ \log(GH) = \log(G) + \log(H). \]
More generally, given a sequence $G_k \in K[[x]]$ with $G_k(0) = 1$ for all $k$ and $\lim_{k\to\infty} G_k = 1$,
\[ \log\left(\prod_{k=1}^{\infty} G_k\right) = \sum_{k=1}^{\infty} \log(G_k). \]

Proof. Since $G$ and $H$ have constant term 1, we know that
\[ GH = \exp(\log G)\exp(\log H) = \exp[(\log G) + (\log H)]. \]
Since $GH$ has constant term 1, we can take logarithms to conclude that
\[ \log(GH) = \log(G) + \log(H), \]
as desired. The formula for converting infinite products to infinite sums is proved similarly.
□


Formal exponentials and logarithms obey formal differentiation rules entirely analogous 
to those learned in calculus. 


7.96. Theorem: Derivative Rules for Exponentials and Logarithms. If $G \in K[[x]]$
satisfies $G(0) = 0$, then $(\exp(G))' = G'\exp(G)$. If $H \in K[[x]]$ satisfies $H(0) = 1$, then
$(\log(H))' = H'/H$.

Proof. A direct calculation using the definition shows that $E' = E + 1$. Applying the formal
chain rule, we conclude that
\begin{align*}
(\exp(G))' &= [(E \bullet G) + 1]' = (E' \bullet G)G' \\
&= ((E + 1) \bullet G)G' = ((E \bullet G) + 1)G' = G'\exp(G).
\end{align*}
Use this result to differentiate the identity $\exp(\log(H)) = H$. We obtain
\[ (\log(H))'\exp(\log(H)) = H', \]
or equivalently $(\log(H))'H = H'$. Since $H(0) = 1$, we can divide by $H$ to conclude that
$(\log(H))' = H'/H$. □
7.97. Theorem: Power Rule for Logarithms. If $H \in K[[x]]$ satisfies $H(0) = 1$ and
$r \in K$, then $\log(H^r) = r\log(H)$.

Proof. On one hand, both $\log(H^r)$ and $r\log(H)$ have formal derivative equal to $rH'/H$
(by 7.79, 7.75, 7.96, and the chain rule). On the other hand, both $\log(H^r)$ and $r\log(H)$
have zero constant term. Thus these two series must be equal (see 7.132). □


7.16 Multivariable Polynomials and Formal Series 


So far we have discussed polynomials and formal power series involving a single indetermi- 
nate. One can generalize this setup to arrive at the notions of multivariable polynomials 
and series. 
7.98. Definition: Formal Multivariable Power Series and Polynomials. A formal
power series in $k$ variables with coefficients in $K$ is a function $F : \mathbb{N}^k \to K$. The set of
all such series is denoted $K[[x_1, x_2, \ldots, x_k]]$. A series $f \in K[[x_1, \ldots, x_k]]$ is a polynomial iff
$\{\vec{n} \in \mathbb{N}^k : f(\vec{n}) \ne 0\}$ is finite. The set of all such polynomials is denoted $K[x_1, \ldots, x_k]$.

The power series notation for a function $F : \mathbb{N}^k \to K$ is
\[ F = \sum_{(n_1, \ldots, n_k) \in \mathbb{N}^k} F(n_1, \ldots, n_k)\, x_1^{n_1}x_2^{n_2} \cdots x_k^{n_k}; \]
the function value $F(n_1, \ldots, n_k)$ is called the coefficient of $x_1^{n_1} \cdots x_k^{n_k}$ in $F$. $F$ is a polynomial
iff only a finite number of its coefficients are nonzero.


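7.99. Definition: Algebraic Operations on Multivariable Series. Given $c \in K$ and
series $F, G \in K[[x_1, \ldots, x_k]]$, the sum $F + G$ is defined by $(F + G)(\vec{n}) = F(\vec{n}) + G(\vec{n})$ for all
$\vec{n} \in \mathbb{N}^k$. The scalar multiple $cF$ is defined by $(cF)(\vec{n}) = c(F(\vec{n}))$ for all $\vec{n} \in \mathbb{N}^k$. The product
$FG$ is defined by
\[ (FG)(\vec{n}) = \sum_{(\vec{i}, \vec{j}) \in (\mathbb{N}^k)^2:\ \vec{i} + \vec{j} = \vec{n}} F(\vec{i})\,G(\vec{j}) \qquad (\vec{n} \in \mathbb{N}^k). \]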
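For polynomials, the product formula is a finite convolution over exponent vectors. One way to sketch it is to store a polynomial as a dictionary mapping each exponent tuple $(n_1, \ldots, n_k)$ to its nonzero coefficient; this representation and the function name `multiply` are assumptions made for illustration.

```python
from collections import defaultdict

def multiply(F, G):
    """Product of two k-variable polynomials stored as {exponent_tuple: coefficient}."""
    product = defaultdict(int)
    for i_vec, f_coeff in F.items():
        for j_vec, g_coeff in G.items():
            # The pair (i_vec, j_vec) contributes to the coefficient at i_vec + j_vec.
            n_vec = tuple(i + j for i, j in zip(i_vec, j_vec))
            product[n_vec] += f_coeff * g_coeff
    return {n_vec: coeff for n_vec, coeff in product.items() if coeff != 0}
```

For instance, with $x_1 + x_2$ stored as `{(1, 0): 1, (0, 1): 1}`, squaring it yields coefficients 1, 2, 1 at the exponent vectors (2, 0), (1, 1), (0, 2).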


7.100. Example. For $1 \le i \le k$, let $x_i : \mathbb{N}^k \to K$ be the polynomial defined by sending
$(0, 0, \ldots, 1, \ldots, 0)$ (the 1 occurs in position $i$) to 1 and sending everything else to zero. One
can check that $cx_1^{n_1} \cdots x_k^{n_k}$ is the series that sends $(n_1, \ldots, n_k)$ to $c$ and everything else to
zero. This justifies the notation used above for elements $F \in K[[x_1, \ldots, x_k]]$, at least when
$F$ is a polynomial.


The following theorem is proved by making the necessary adjustments to the proofs 
given in the one-variable case. Even more general results are sketched in the exercises. 


7.101. Theorem: Algebraic Structure of Multivariable Series and Polynomials.
$K[[x_1, \ldots, x_k]]$ and $K[x_1, \ldots, x_k]$ are commutative rings that are integral domains, as well
as vector spaces over $K$ containing a copy of $K$. The set $\{x_1^{n_1} \cdots x_k^{n_k} : (n_1, \ldots, n_k) \in \mathbb{N}^k\}$
is a basis for the vector space $K[x_1, \ldots, x_k]$.


Multivariable polynomial rings satisfy the following universal mapping property that 
generalizes 7.22. The proof is also left as an exercise. 


7.102. Theorem: Evaluation Homomorphisms for Multivariable Polynomials.
Let $S$ be a commutative ring containing $K$. (a) For each $f \in K[x_1, \ldots, x_k]$, there is an
associated function $P_f : S^k \to S$ given by
\[ P_f(z_1, \ldots, z_k) = \sum_{(n_1, \ldots, n_k) \in \mathbb{N}^k} f(n_1, \ldots, n_k)\, z_1^{n_1} \cdots z_k^{n_k} \qquad (z_i \in S). \]
(b) For each $k$-tuple $\vec{z} = (z_1, \ldots, z_k) \in S^k$, there exists a unique ring homomorphism
$E : K[x_1, \ldots, x_k] \to S$ such that $E(c) = c$ for all $c \in K$ and $E(x_i) = z_i$ for all $i \le k$;
namely, $E(f) = P_f(\vec{z})$ for $f \in K[x_1, \ldots, x_k]$. We write $E = \mathrm{ev}_{\vec{z}}$ and call it the evaluation
homomorphism determined by setting each $x_i$ equal to $z_i$.


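7.103. Definition: Formal Partial Derivatives. For $1 \le i \le k$, define a map $D_i :
K[[x_1, \ldots, x_k]] \to K[[x_1, \ldots, x_k]]$ by
\[ D_i(F) = \frac{\partial F}{\partial x_i} = \sum_{(n_1, n_2, \ldots, n_k) \in \mathbb{N}^k} (n_i + 1)F(n_1, \ldots, n_i + 1, \ldots, n_k)\, x_1^{n_1} \cdots x_i^{n_i} \cdots x_k^{n_k}. \]
$D_i$ is called the formal partial derivative operator with respect to $x_i$.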
It is routine to check that the analogues of the one-variable differentiation rules (§7.8)
extend to the partial derivative operators D;. There are also formal versions of the multi- 
variable chain rule. We now prove one such rule for multivariable polynomials, which will 
be used in §10.16. 


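7.104. Theorem: Multivariable Chain Rule. Let $h \in K[y_1, \ldots, y_n]$ and $g_1, \ldots, g_n \in
K[x_1, \ldots, x_m]$. Let $h \bullet g \in K[x_1, \ldots, x_m]$ denote the polynomial obtained by setting each
$y_j = g_j$ in $h$. For $1 \le k \le m$,
\[ D_k(h \bullet g) = \sum_{j=1}^{n} ((D_jh) \bullet g)\, D_k(g_j). \]
Informally, we may write
\[ \frac{\partial (h \bullet g)}{\partial x_k} = \frac{\partial h}{\partial y_1}\frac{\partial g_1}{\partial x_k} + \cdots + \frac{\partial h}{\partial y_n}\frac{\partial g_n}{\partial x_k}, \]
with all of the partial derivatives $\partial h/\partial y_j$ being evaluated at $(y_1, \ldots, y_n) = (g_1, \ldots, g_n)$.

Proof. Both sides of the claimed identity are $K$-linear functions of $h$. So it suffices to check
the identity when $h$ has the form $y_1^{e_1} \cdots y_n^{e_n}$. In this case, $h \bullet g = g_1^{e_1} \cdots g_n^{e_n}$. Viewing this
as a product of $e_1 + \cdots + e_n$ factors, each equal to some $g_j$, the multivariable product rule
leads to
\[ D_k(h \bullet g) = \sum_{j=1}^{n} e_j\, g_1^{e_1} \cdots g_j^{e_j - 1}(D_kg_j) \cdots g_n^{e_n}. \]
On the other side, $(D_jh) \bullet g = e_j\, g_1^{e_1} \cdots g_j^{e_j - 1} \cdots g_n^{e_n}$. Multiplying by $D_k(g_j)$ and summing
over $j$ gives the same answer as before, so the proof is complete. □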
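The chain rule can be tested on concrete polynomials by computing both sides independently. The sketch below is only a check on examples, not a proof; it assumes polynomials are stored as dictionaries from exponent tuples to coefficients, uses 0-based variable indices, and all function names are illustrative.

```python
from collections import defaultdict

def add(F, G):
    total = defaultdict(int)
    for poly in (F, G):
        for n_vec, coeff in poly.items():
            total[n_vec] += coeff
    return {n: c for n, c in total.items() if c != 0}

def multiply(F, G):
    product = defaultdict(int)
    for i_vec, f in F.items():
        for j_vec, g in G.items():
            product[tuple(i + j for i, j in zip(i_vec, j_vec))] += f * g
    return {n: c for n, c in product.items() if c != 0}

def partial(F, i):
    """Formal partial derivative with respect to variable number i (0-based)."""
    result = {}
    for n_vec, coeff in F.items():
        if n_vec[i] > 0:
            lowered = n_vec[:i] + (n_vec[i] - 1,) + n_vec[i + 1:]
            result[lowered] = coeff * n_vec[i]
    return result

def compose(h, gs, m):
    """Substitute y_j = gs[j] into h; the result has m variables."""
    result = {}
    for e_vec, coeff in h.items():
        term = {(0,) * m: coeff}
        for g, e in zip(gs, e_vec):
            for _ in range(e):
                term = multiply(term, g)
        result = add(result, term)
    return result

def chain_rule_sides(h, gs, m, k):
    """Return (D_k(h o g), sum_j ((D_j h) o g) * D_k(g_j))."""
    left = partial(compose(h, gs, m), k)
    right = {}
    for j, g in enumerate(gs):
        right = add(right, multiply(compose(partial(h, j), gs, m), partial(g, k)))
    return left, right
```

For example, taking $h = y_1^2y_2 + 3y_2$, $g_1 = x_1 + x_2$, and $g_2 = x_1x_2 - 1$, the two dictionaries returned by `chain_rule_sides` coincide for both choices of `k`.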


Summary 


Table 7.1 reviews the definitions of concepts and operations involving formal power series 
and polynomials. Table 7.2 lists some rules and formulas arising in computations with formal 
power series (some hypotheses on constant terms are omitted in this table). Let us also recall 
the following results. 


• Algebraic Structure of $K[[x]]$ and $K[x]$. Both $K[[x]]$ and $K[x]$ are commutative rings,
integral domains, and vector spaces over $K$ containing a copy of $K$. The same holds for
$K[[x_1, \ldots, x_k]]$ and $K[x_1, \ldots, x_k]$. The set $\{x^i : i \ge 0\}$ is a basis for $K[x]$, whereas the
set $\{x_1^{n_1} \cdots x_k^{n_k} : (n_1, \ldots, n_k) \in \mathbb{N}^k\}$ is a basis for $K[x_1, \ldots, x_k]$.

• Degree and Order. For polynomials $f, g \in K[x]$, $\deg(f + g) \le \max(\deg(f), \deg(g))$ and
$\deg(fg) = \deg(f) + \deg(g)$ whenever both sides are defined. For series $F, G \in K[[x]]$,
we have $\mathrm{ord}(F + G) \ge \min(\mathrm{ord}(F), \mathrm{ord}(G))$ and $\mathrm{ord}(FG) = \mathrm{ord}(F) + \mathrm{ord}(G)$ whenever
both sides are defined.

• Multiplicative Inverses in $K[x]$ and $K[[x]]$. A polynomial $f$ is invertible (a unit) in
$K[x]$ iff $\deg(f) = 0$. A series $F$ is invertible in $K[[x]]$ iff $F(0) \ne 0$. In this case, one
can use the formal geometric series to invert $F$ or the recursive formula $(F^{-1})_n =
-(1/F_0)\sum_{k=1}^{n} F_k(F^{-1})_{n-k}$. A nonzero formal quotient $F/G \in K((x))$ can be written
in a unique way as a Laurent series $\sum_{n=m}^{\infty} a_nx^n$ where $m \in \mathbb{Z}$, each $a_n \in K$, and $a_m \ne 0$;
$m = \mathrm{ord}(F) - \mathrm{ord}(G)$ is the order of this Laurent series.
TABLE 7.1
Definitions concerning formal power series and polynomials.

Concept                   Brief Definition
formal power series       function $F : \mathbb{N} \to K$, denoted $\sum_{n=0}^{\infty} F_nx^n$
formal polynomial         formal series $f$ with $\{n : f_n \ne 0\}$ finite
ring                      set with addition and multiplication satisfying axioms in 2.2
integral domain           nonzero commutative ring with no zero divisors
field                     nonzero commutative ring with all nonzero elements invertible
$K[[x]]$                  set of formal power series with coefficients in $K$
$K[x]$                    set of formal polynomials with coefficients in $K$
$K(x)$                    $\{f/g : f, g \in K[x], g \ne 0\}$ = field of fractions of $K[x]$
$K((x))$                  $\{F/G : F, G \in K[[x]], G \ne 0\}$ = field of fractions of $K[[x]]$,
                          or the set of formal Laurent series $\sum_{n=m}^{\infty} a_nx^n$ ($m \in \mathbb{Z}$, $a_n \in K$)
the series $x$            $x = (0, 1, 0, 0, \ldots)$
equality of series        $F = G$ iff $F_n = G_n$ for all $n \in \mathbb{N}$
sum of series             $(F + G)(n) = F(n) + G(n)$ for all $n \in \mathbb{N}$
product of series         $(FG)(n) = \sum_{(i,j) \in \mathbb{N}^2:\ i+j=n} F(i)G(j)$ for $n \in \mathbb{N}$ (convolution)
derivative of series      $F'(n) = (dF/dx)_n = (n + 1)F(n + 1)$ for $n \in \mathbb{N}$
$k$th derivative          $F^{(k)}(n) = (d^kF/dx^k)_n = (n + 1) \cdots (n + k)F(n + k)$ ($n \in \mathbb{N}$)
formal limit              $F_n \to L$ iff $\forall m \in \mathbb{N}, \exists N \in \mathbb{N}, \forall n \in \mathbb{N}, (n \ge N \Rightarrow F_n(m) = L(m))$
infinite sum of series    $\sum_{n=0}^{\infty} F_n = \lim_{N\to\infty} \sum_{n=0}^{N} F_n$ (if limit exists)
infinite product          $\prod_{n=0}^{\infty} F_n = \lim_{N\to\infty} \prod_{n=0}^{N} F_n$ (if limit exists)
formal composition        $F \bullet G = \sum_{n=0}^{\infty} F_nG^n$ (need $F \in K[x]$ or $G(0) = 0$)
formal $e^x$              $e^x = \sum_{n=0}^{\infty} x^n/n! \in K[[x]]$
formal $\sin x$           $\sin x = (0, 1, 0, -1/3!, 0, 1/5!, 0, \ldots) \in K[[x]]$
formal $\cos x$           $\cos x = (1, 0, -1/2!, 0, 1/4!, 0, -1/6!, \ldots) \in K[[x]]$
formal $\log(1 + x)$      $\log(1 + x) = \sum_{n \ge 1} (-1)^{n-1}x^n/n \in K[[x]]$
formal $(1 + x)^r$        $\mathrm{Pow}_r = (1 + x)^r = \sum_{n=0}^{\infty} ((r){\downarrow_n}/n!)x^n \in K[[x]]$ ($r \in K$)
formal power              $F^r = \sum_{n=0}^{\infty} ((r){\downarrow_n}/n!)(F - 1)^n$ (need $F(0) = 1$)
formal exponential        $\exp(G) = \sum_{n=0}^{\infty} G^n/n!$ (need $G(0) = 0$)
formal logarithm          $\log(1 + G) = \sum_{n \ge 1} (-1)^{n-1}G^n/n$ (need $G(0) = 0$)
                          or $\log(F) = \sum_{n \ge 1} (-1)^{n-1}(F - 1)^n/n$ (need $F(0) = 1$)
degree                    $\deg(f) = \max\{n : f_n \ne 0\}$ for nonzero $f \in K[x]$
order                     $\mathrm{ord}(F) = \min\{n : F_n \ne 0\}$ for nonzero $F \in K[[x]]$
polynomial function       for $f \in K[x]$, $P_f : S \to S$ sends $z \in S \supseteq K$ to $\sum_{n=0}^{\deg(f)} f_nz^n$
ring homomorphism         map between rings preserving $+$, $\times$, and 1
evaluation hom.           $\mathrm{ev}_z(f) = P_f(z)$ for $f \in K[x]$, $z \in S \supseteq K$
$K[[x_1, \ldots, x_k]]$   set of $k$-variable formal series $F : \mathbb{N}^k \to K$
$K[x_1, \ldots, x_k]$     set of $k$-variable polynomials
partial derivative        $D_iF(n_1, \ldots, n_k) = (n_i + 1)F(n_1, \ldots, n_i + 1, \ldots, n_k)$
TABLE 7.2
Rules for calculating with formal power series.

Product of $k$ series: $(F_1F_2 \cdots F_k)(n) = \sum_{i_1 + i_2 + \cdots + i_k = n} F_1(i_1)F_2(i_2) \cdots F_k(i_k)$.

Positive powers: $G^m(n) = \sum \binom{m}{k_0, \ldots, k_n} G(0)^{k_0} \cdots G(n)^{k_n}$,
summed over $(k_0, \ldots, k_n)$ with $\sum_i k_i = m$ and $\sum_i ik_i = n$.

Geometric series: $(1 - G)^{-1} = \sum_{n=0}^{\infty} G^n$ when $G(0) = 0$.

Negative powers: $(1 - G)^{-m} = \sum_{n=0}^{\infty} \binom{n + m - 1}{n, m - 1} G^n$ when $m \in \mathbb{N}^+$ and $G(0) = 0$.

Limit rules: $F_n \to P$ and $G_n \to Q$ imply
$F_n + G_n \to P + Q$, $F_nG_n \to PQ$, $F_n \bullet G_n \to P \bullet Q$.

Derivative rules: $(F + G)' = F' + G'$, $(FG)' = (F')G + F(G')$, $(x^k)' = kx^{k-1}$,
$(F \bullet G)' = (F' \bullet G)G'$, $(F^r)' = rF^{r-1}F'$,
$(\exp(G))' = G'\exp(G)$ [$G(0) = 0$], $(\log(H))' = H'/H$ [$H(0) = 1$],
$(\lim_n H_n)' = \lim_n (H_n')$, $(\sum_n H_n)' = \sum_n H_n'$,
$F = \sum_{k=0}^{\infty} (F^{(k)}(0)/k!)x^k$; $D_k(h \bullet g) = \sum_j ((D_jh) \bullet g)\, D_k(g_j)$.

Laws of exponents: $\mathrm{Pow}_{r+s} = \mathrm{Pow}_r\mathrm{Pow}_s$; $F^{r+s} = F^rF^s$;
$(F^r)^n = F^{rn}$ for $r, s \in K$, $n \in \mathbb{Z}$.

Exp and Log: $\exp(G + H) = \exp(G)\exp(H)$; $\exp(\sum_k G_k) = \prod_k \exp(G_k)$;
$\log(GH) = \log(G) + \log(H)$; $\log(\prod_k G_k) = \sum_k \log(G_k)$;
$\log(\exp(G)) = G$ [$G(0) = 0$]; $\exp(\log(H)) = H$ [$H(0) = 1$];
$\log(H^r) = r\log(H)$ [$H(0) = 1$].


• Compositional Inverses in $K[[x]]$. Formal composition is associative and has $x$ as a two-sided
identity. For fixed $G$, the map $F \mapsto F \bullet G$ is a ring homomorphism fixing $K$.
A series $F \in K[[x]]$ has an inverse $G$ relative to formal composition if $F(0) = 0$ and
$F(1) \ne 0$. The set of all such series is closed under composition and forms a group under
this operation. The inverse $G$ of $F$ may be found by Lagrange inversion (see 8.15) or by
the recursive formula $G_n = -(1/F_1^n)\left(\sum_{m=0}^{n-1} G_mF^m\right)_n$.


• Evaluation Homomorphisms. Let $S$ be a commutative ring containing $K$ and $z_1, \ldots, z_k \in
S$. There exists a unique ring homomorphism $E : K[x_1, \ldots, x_k] \to S$ such that $E(x_i) =
z_i$ for $1 \le i \le k$ and $E(c) = c$ for all $c \in K$.


• Density of Polynomials in $K[[x]]$. For each $F \in K[[x]]$, $F = \lim_{N\to\infty} \sum_{n=0}^{N} F_nx^n$, so
that any formal power series is a limit of formal polynomials.


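• Existence Criteria for Formal Limits. For nonzero series $F_k \in K[[x]]$:
$F_k \to 0$ in $K[[x]]$ iff $\mathrm{ord}(F_k) \to \infty$ in $\mathbb{R}$;
$\sum_{k=0}^{\infty} F_k$ exists in $K[[x]]$ iff $\mathrm{ord}(F_k) \to \infty$ in $\mathbb{R}$;
if $F_k(0) = 0$ for all $k$, $\prod_{k=1}^{\infty} (1 + F_k)$ exists in $K[[x]]$ iff $\mathrm{ord}(F_k) \to \infty$ in $\mathbb{R}$.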


• Polynomial Factorization in $\mathbb{C}[x]$ and Partial Fractions. A monic polynomial $p \in \mathbb{C}[x]$
factors uniquely as $(x - r_1)^{n_1} \cdots (x - r_k)^{n_k}$ with $r_i \in \mathbb{C}$. If instead $p(0) = 1$, we can write
$p = (1 - s_1x)^{n_1} \cdots (1 - s_kx)^{n_k}$ with $s_i \in \mathbb{C}$. Given $g \in \mathbb{C}[x]$, there is a unique expression
\[ \frac{g}{p} = h + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \frac{a_{ij}}{(1 - s_ix)^j} \]
where $h$ is the polynomial quotient when $g$ is divided by $p$; each $a_{i,n_i}$ is found by multiplying
all terms by $(1 - s_ix)^{n_i}$ and setting $x = 1/s_i$; and the remaining $a_{ij}$'s are found by
subtracting the previously recovered terms and iterating.


• Recursions with Constant Coefficients. Suppose $F_n = \sum_{i=1}^{k} c_iF_{n-i} + H_n$ for all $n \ge k$,
where $c_1, \ldots, c_k \in K$ and $H \in K[[x]]$ are given. $F$ is uniquely determined by the $k$ initial
values $F_0, \ldots, F_{k-1}$. The series $F$ has the form $G/p$, where $p = 1 - c_1x - \cdots - c_kx^k$ and
$G$ is a series with $G_n = H_n$ for $n \ge k$.
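As an illustration of the recursive formula for multiplicative inverses recalled in this summary, $(F^{-1})_n = -(1/F_0)\sum_{k=1}^{n} F_k(F^{-1})_{n-k}$ with $(F^{-1})_0 = 1/F_0$, here is a small sketch. The function name and the convention that `F` is a finite list of leading coefficients (all later coefficients treated as zero) are assumptions.

```python
from fractions import Fraction

def series_inverse(F, terms):
    """First `terms` coefficients of 1/F, where F is a coefficient list with F[0] != 0."""
    if F[0] == 0:
        raise ValueError("F is invertible only when F(0) != 0")
    f0 = Fraction(F[0])
    inv = [1 / f0]
    for n in range(1, terms):
        # Coefficient of x^n in F * F^{-1} must vanish for n >= 1.
        s = sum(Fraction(F[k]) * inv[n - k] for k in range(1, n + 1) if k < len(F))
        inv.append(-s / f0)
    return inv
```

Applied to $1 - x - x^2$ (the list `[1, -1, -1]`), the output begins 1, 1, 2, 3, 5, 8, which are the Fibonacci numbers shifted by one index, as expected from $F = x/(1 - x - x^2)$.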


Exercises 


7.105. Let $f = x - x^2 + 3x^4$ and $g = 1 - 2x + 3x^4$. Compute $f + g$, $fg$, and the degrees
and orders of $f$, $g$, $f + g$, and $fg$.


7.106. Let $F = (1, 0, 1, 0, 1, 0, \ldots)$ and $G = \sum_{n \ge 0} nx^n$. Compute $F + G$, $FG$, $F(1 + x)$,
$F(1 - x^2)$, $G(1 + x)$, $F'$, $G'$, and the orders of all these series.


7.107. Let f =2?+42—1 and g=2x* +2. Compute P;(2), P,(vW5), Pr(x), Pr(g), Po(f), 
and Py(f). 


7.108. Compute the coefficient of $x^n$ for $0 \le n \le 6$ for each of the following formal series:
(a) $e^x + \sin x$; (b) $e^x\sin x$; (c) $(\cos x)\log(1 + x)$; (d) $(\log(1 + x))^2$.


7.109. (a) Find necessary and sufficient conditions for strict inequality to hold in the
formula $\deg(f + g) \le \max(\deg(f), \deg(g))$. (b) Find necessary and sufficient conditions for
strict inequality to hold in the formula $\mathrm{ord}(F + G) \ge \min(\mathrm{ord}(F), \mathrm{ord}(G))$.


7.110. Use (7.3) in the proof of 7.40 to find the first five terms in the multiplicative inverse
of each of the following series: (a) $e^x$; (b) $1 - 2x + x^2 + 3x^4$; (c) $1 + \log(1 + x)$.


7.111. Use 7.41 to find the first five terms in (1 — 2+ 2)7?. 
7.112. Compute the multiplicative inverse of $7°°_, 7a" in K((2)). 


7.113. Convert the following expressions to formal Laurent series: (a) (a? + 3)/(a? — 27); 
(b) a/(x* — 5a? + 62). 


7.114. Formal Hyperbolic Sine and Cosine Functions. Define formal series $\sinh x =
(e^x - e^{-x})/2$ and $\cosh x = (e^x + e^{-x})/2$. (a) Find $(\sinh x)_n$ and $(\cosh x)_n$ for all $n \in \mathbb{N}$. (b)
Show $(\sinh x)' = \cosh x$ and $(\cosh x)' = \sinh x$.


7.115. Complete the proof of 7.8(a) by verifying the remaining ring axioms for $K[[x]]$.
Indicate which of the ring axioms for $K$ are used in each part of your proof.


7.116. Let $R$ be any ring. Verify that the sum and product operations in 7.6 (with $K$
replaced by $R$) make $R[x]$ and $R[[x]]$ rings, which are commutative if $R$ is commutative.
7.117. This exercise shows that the characterization of units in $K[x]$ given in 7.38 can fail
if $K$ is not a field. (a) Give an example of a commutative ring $R$ and $f \in R[x]$ such that
$\deg(f) = 0$ but $f$ is not a unit of $R[x]$. (b) Give an example of a commutative ring $R$ and
$f \in R[x]$ such that $\deg(f) > 0$, yet $f$ is a unit of $R[x]$. (c) Show that for any $n \in \mathbb{N}^+$, there
exists $f$ as in part (b) with $n$ nonzero coefficients.


7.118. Continuity of Exp and Log. (a) Assume $F_k(0) = G(0) = 0$ and $F_k \to G$ in
$K[[x]]$. Prove $\exp(F_k) \to \exp(G)$. (b) Assume $F_k(0) = G(0) = 1$ and $F_k \to G$. Prove
$\log(F_k) \to \log(G)$.


7.119. Prove the following general version of the universal mapping property for polynomial
rings. Let $f : L \to R$ be a given ring homomorphism between two commutative rings. Given
$(z_1, \ldots, z_k) \in R^k$, there exists a unique ring homomorphism $E : L[x_1, \ldots, x_k] \to R$ such
that $E(x_i) = z_i$ for all $i$ and $E(c) = f(c)$ for all $c \in L$. Point out any steps in your proof
that require the assumption that the rings are commutative.


7.120. (a) Show that, because $K$ is an infinite field, the map $\pi : K[x] \to {}^{K}K$, given by
$\pi(f) = P_f$ for $f \in K[x]$, is injective. (b) Give an example of a commutative ring $R$ such
that the map $\pi : R[x] \to {}^{R}R$ is not injective.


7.121. Prove 7.20(a),(c). 
7.122. Let $F_k, G_k, P, Q \in K[[x]]$ satisfy $F_k \to P$ and $G_k \to Q$. Prove $F_k + G_k \to P + Q$.
7.123. Prove 7.54(a),(b),(c),(g). 


7.124. Complete the following outline to give a new proof of the formal product rule
$(FG)' = (F')G + F(G')$ for $F, G \in K[[x]]$. (a) Show that the result holds when $F = x^i$ and
$G = x^j$, for all $i, j \in \mathbb{N}$. (b) Deduce from (a) that the result holds for all $F, G \in K[x]$. (c)
Use a continuity argument to obtain the result for all $F, G \in K[[x]]$.


7.125. Prove 7.62(b),(c). 


7.126. Use a continuity argument to deduce the formal chain rule for formal power series 
(see 7.64) from the chain rule for polynomials. 


7.127. Prove the following formal derivative identities: (a) $(\sin x)' = \cos x$; (b) $(\cos x)' =
-\sin x$; (c) $[\log(1 + x)]' = (1 + x)^{-1}$.


7.128. Formal Quotient Rule. Suppose $F, G \in K[[x]]$ where $G(0) \ne 0$. Prove the derivative
rule $(F/G)' = (GF' - FG')/G^2$.


7.129. Formal Integrals. The formal integral or antiderivative of a series $F \in K[[x]]$ is
the series
\[ \int F\,dx = \sum_{n \ge 1} \frac{F_{n-1}}{n}x^n, \]
which has constant term zero. Compute the formal integrals of the following formal power
series: (a) 3+ 2x — 7a? + 122°; (b) Di ns9272"; (c) Dnso(m + 1!a"; (d) e*; (e) sina; (f) 


cosa; (g) (l+2)~*; (h) 35a. 
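The definition of the formal integral in 7.129 translates directly into a coefficient shift. The following sketch, with illustrative function names and a finite coefficient list standing in for a polynomial, may help when experimenting with this exercise; it does not supply the requested answers.

```python
from fractions import Fraction

def formal_integral(F):
    """Coefficients of the formal integral: constant term 0, then F_{n-1}/n for n >= 1."""
    return [Fraction(0)] + [Fraction(c) / n for n, c in enumerate(F, start=1)]

def formal_derivative(F):
    """Coefficients of F': the coefficient of x^n is (n+1) F_{n+1}."""
    return [n * Fraction(c) for n, c in enumerate(F)][1:]
```

Differentiating the integral recovers the original list, while integrating the derivative loses the constant term, in line with the formal fundamental theorems of calculus stated in 7.130(f).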


7.130. Prove the following facts about formal integrals (as defined in 7.129).

(a) (sum rule) $\int (F + G)\,dx = \int F\,dx + \int G\,dx$ for $F, G \in K[[x]]$.

(b) (scalar rule) $\int cF\,dx = c\int F\,dx$ for $c \in K$ and $F \in K[[x]]$.

(c) (linear combination rule) $\int \sum_{i=1}^{n} c_iH_i\,dx = \sum_{i=1}^{n} c_i\int H_i\,dx$ for $c_i \in K$ and $H_i \in K[[x]]$.
Can you formulate a similar statement for infinite sums?

(d) (power rule) $\int x^k\,dx = \frac{1}{k+1}x^{k+1}$ for all $k \ge 0$.

(e) (general antiderivatives) For all $F, G \in K[[x]]$, $G' = F$ iff there exists $c \in K$ with
$G = \int F\,dx + c$.

(f) (formal fundamental theorems of calculus) $F = \frac{d}{dx}\int F\,dx$ and $\int F'\,dx = F - F(0)$ for
all $F \in K[[x]]$.

(g) (continuity of integration) If $F_k, H \in K[[x]]$ and $F_k \to H$, then $\int F_k\,dx \to \int H\,dx$.


7.131. Formulate and prove an “integration by parts” rule and a “substitution rule” for 
formal integrals (as defined in 7.129). 


7.132. (a) Given $F, G \in K[[x]]$, prove that $F = G$ iff $F' = G'$ and $F(0) = G(0)$. (b) State
and prove an analogous statement for multivariable series.


7.133. (a) Prove that $(\sin x)^2 + (\cos x)^2 = 1$ in $K[[x]]$ by computing the coefficient of $x^n$ on
each side. (b) Prove that $(\sin x)^2 + (\cos x)^2 = 1$ in $K[[x]]$ by invoking 7.132 and derivative
rules.


7.134. Prove that $(\cosh x)^2 - (\sinh x)^2 = 1$ in $K[[x]]$.


7.135. Cancellation in an Integral Domain. Let $R$ be a nonzero commutative ring.
Prove that $R$ is an integral domain iff the following cancellation axiom holds: for all $a, b, c \in
R$, $ab = ac$ and $a \ne 0$ imply $b = c$.


7.136. Product Rule for Multiple Factors. Let $F_1, \ldots, F_k \in K[[x]]$. Prove that
\[ \frac{d}{dx}(F_1F_2 \cdots F_k) = \sum_{j=1}^{k} F_1 \cdots F_{j-1}\left(\frac{d}{dx}F_j\right)F_{j+1} \cdots F_k. \]
Does a version of this rule hold for infinite products?


7.137. Use 7.11 to prove the differentiation rule $\frac{d}{dx}(G^m) = mG^{m-1}G'$ for $G \in K[[x]]$ and
$m \in \mathbb{N}^+$ without using the formal chain rule.


7.138. For m,n € N*, evaluate the sum 
m! 
> Kgl0lFo h/t «fe, inlFn 
(ko,k1,--.,kn )EN"*?: 
7.139. Let $F = \prod_{k=1}^{\infty} (1 - x^k)$. Find $F_n$ for $0 \le n \le 22$. Can you see a pattern?


7.140. Carefully justify the following calculation:
\[ \prod_{n=1}^{\infty} (1 - x^{2n-1})^{-1} = \prod_{i=1}^{\infty} (1 - x^{2i}) \prod_{j=1}^{\infty} (1 - x^j)^{-1} = \prod_{k=1}^{\infty} (1 + x^k). \]
In particular, explain why all the infinite products appearing here exist.


7.141. Find a necessary and sufficient condition on series $F_k \in K[[x]]$ so that the infinite
product $\prod_{k=1}^{\infty} (1 + F_k)^{-1}$ exists.


7.142. Evaluate $\prod_{n=0}^{\infty} (1 + x^{2^n})$.


7.143. Verify the partial fraction expansion of F’ given in 7.91. 
7.144. Write out the formal series for each of the following expressions: (a) (1 — x)~°; (b)


V14+4; (c) 1/V1 — «7; (d) VI + 32. 
7.145. Compute the first four nonzero terms in the following series: (a) V1 + # + 32; (b) 
Veos 2; (c) (Upso(n + 1)24%) 3/2, 


7.146. Compute the first four nonzero terms in: (a) exp(sin x); (b) log(cos x). 


7.147. Find the partial fraction decomposition of $F = (10 + 2x)/(1 - 2x - 8x^2)$, and use
this to determine $F(n)$ for all $n$.


7.148. Find the partial fraction decomposition of $F = (1 - 7x)/(15x^2 - 8x + 1)$, and use
this to determine $F(n)$ for all $n$.


7.149. Find the partial fraction decomposition of $F = (2x^3 - 4x^2 - x - 3)/(2x^2 - 4x + 2)$,
and use this to determine $F(n)$ for all $n$.


7.150. Find the partial fraction decomposition of F = (152° + 302° — 15x24 — 35x47 — 15a? 
12x — 8)/(15(a4* + 2x — 2x — 1)), and use this to determine F(n) for all n. 


7.151. (a) Solve the recursion $a_n = 3a_{n-1}$ ($n \ge 1$), given that $a_0 = 2$. (b) Solve the recursion
$a_n = 3a_{n-1} + 3n$ ($n \ge 1$), given that $a_0 = 2$. (c) Solve the recursion $a_n = 3a_{n-1} + 3^n$ ($n \ge 1$),
given that $a_0 = 2$.


7.152. Solve the recursion $a_n = 6a_{n-1} - 8a_{n-2} + g(n)$ (for $n \ge 2$) with initial conditions
$a_0 = 0$, $a_1 = 2$ for the following choices of $g(n)$: (a) $g(n) = 0$; (b) $g(n) = 1$; (c) $g(n) = 2^n$;
(d) g(n) =n”. 


7.153. The Lucas numbers are defined by setting $L_0 = 1$, $L_1 = 3$, and $L_n = L_{n-1} + L_{n-2}$
for $n \ge 2$. Use formal series to find a closed formula for $L_n$.


7.154. Solve the recursion $a_n = -3a_{n-1} + 2a_{n-2} + 6a_{n-3} - a_{n-4} - 3a_{n-5}$ (for $n \ge 5$) with
initial conditions $a_k = k$ for $0 \le k < 5$.


7.155. Repeat 7.154 with initial conditions $a_k = 3$ for $0 \le k < 5$.

7.156. Suppose $b_0 = 1$ and $b_n = b_0 + b_1 + \cdots + b_{n-1} + 1$ for all $n \ge 1$. Find $\sum_{n \ge 0} b_nx^n$.


7.157. Suppose $(c_n : n \in \mathbb{Z})$ satisfies $c_0 = 0$, $c_1 = 1$, and $c_n = (c_{n-1} + c_{n+1})/L$ for all
$n \in \mathbb{Z}$, where $L \in \mathbb{R}^+$ is a constant. Find an explicit formula for $c_n$.


7.158. Differentiation of Laurent Series. Define a version of the formal derivative operator
for the ring $K((x))$ of formal Laurent series. Extend the derivative rules (in particular,
the quotient rule) to this ring.


7.159. Formal Tangent and Secant Functions. Define formal series $\sec x = 1/\cos x$ and
$\tan x = \sin x/\cos x$. (a) Compute $(\sec x)_n$ and $(\tan x)_n$ for $0 \le n \le 9$. (A combinatorial
interpretation of these coefficients is described in §12.8.) (b) Show that $(\tan x)^2 + 1 = (\sec x)^2$.
(c) Show that $(\tan x)' = (\sec x)^2$ and $(\sec x)' = \tan x\sec x$. (d) Show that $(\tan x)_n = 0$ for
all even $n$ and $(\sec x)_n = 0$ for all odd $n$. (e) Can you give similar definitions and results
for $\cot x$ and $\csc x$?


7.160. Substitution of $rx$ for $x$. Given $F \in K[[x]]$ and nonzero $r \in K$, define $F(rx)$
to be the formal composition $F \bullet (rx)$. Prove: (a) $\sin(2x) = 2\sin x\cos x$; (b) $\cos(2x) =
(\cos x)^2 - (\sin x)^2$; (c) $\exp(rx) = \exp(x)^r$ for nonzero $r \in K$; (d) $\exp(ix) = \cos x + i\sin x$,
$\cos x = (\exp(ix) + \exp(-ix))/2$, and $\sin x = (\exp(ix) - \exp(-ix))/2i$ (assuming $i = \sqrt{-1} \in
K$).
7.161. Even and Odd Formal Series. A series $F \in K[[x]]$ is even iff $F(-x) = F$
(see 7.160); $F$ is odd iff $F(-x) = -F$. (a) Show that $F$ is even iff $F_n = 0$ for all odd $n$,
and $F$ is odd iff $F_n = 0$ for all even $n$. (b) Which formal trigonometric and hyperbolic
trigonometric series are odd? Which are even? (c) Give rules for determining the parity
(even or odd) of $F + G$, $FG$, and (when defined) $F^{-1}$, given the parity of $F$ and $G$.


7.162. Let $R$ be a commutative ring. Recall that $x \in R$ is a unit of $R$ iff there exists $y \in R$
with $xy = yx = 1_R$; $x$ is nilpotent iff there exists $n \in \mathbb{N}^+$ with $x^n = 0_R$. (a) Suppose $x \in R$
is arbitrary and $z \in R$ is nilpotent. Prove $xz$ is nilpotent. (b) Suppose $x \in R$ is a unit of $R$
and $y \in R$ is nilpotent. Prove $x + y$ is a unit of $R$. (c) Suppose $x, y \in R$ are both nilpotent.
Prove $x + y$ is nilpotent. (d) Which results in (a), (b), and (c) hold if $R$ is a non-commutative
ring?


7.163. Let $R$ be a nonzero commutative ring. Prove that $f \in R[x]$ is a unit of $R[x]$ iff $f_0$
is a unit of $R$ and $f_n$ is nilpotent in $R$ for all $n > 0$.


7.164. Use (7.6) in the proof of 7.65 to find the first several coefficients in the compositional
inverses of each of the following series: (a) $\sin x$; (b) $\tan x$; (c) $x/(1 - x)$.


7.165. Use the formal Maclaurin formula 7.55 to deduce the series expansions of $e^x$, $\sin x$,
$\cos x$, and $(1 - rx)^{-1}$ starting from the rules for differentiating these formal series.


7.166. Taylor’s formula states that (under suitable hypotheses on $f : \mathbb{R} \to \mathbb{R}$) $f(x) =
\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$ for all $x$ sufficiently close to $a$. Give two reasons why this formula is
not meaningful (as written) for formal power series, when $a \in K$ is nonzero.


7.167. (a) Show that $\sum_{n \ge 0} x^{3n}/(3n)! = (1/3)e^x + (2/3)\cos(x\sqrt{3}/2)e^{-x/2}$ in $\mathbb{C}[[x]]$. (b) Try
to find similar formulas for $\sum_{n \ge 0} x^{3n+1}/(3n + 1)!$ and $\sum_{n \ge 0} x^{3n+2}/(3n + 2)!$.


7.168. Exponential Generating Functions. Given a sequence $F = (F_n : n \ge 0) \in
K[[x]]$, the exponential generating function of this sequence is $F^* = \sum_{n \ge 0} (F_n/n!)x^n \in
K[[x]]$. Prove that, for all $F, G \in K[[x]]$: (a) $(F + G)^* = F^* + G^*$; (b) $n!(F^*G^*)_n =
\sum_{k=0}^{n} \binom{n}{k}F_kG_{n-k}$; (c) $\frac{d}{dx}(F^*) = ((F - F(0))/x)^*$.


7.169. Sum of Squares via Formal Series. The goal of this problem is to use series to
derive a formula for $\sum_{k=0}^{n} k^2$ without guessing the answer in advance. (a) Express the series
$\sum_{n \ge 0} n^2x^n$ as a linear combination of the series $(1 - x)^{-1}$, $(1 - x)^{-2}$, and $(1 - x)^{-3}$. (b)
Perform a suitable operation on the series in (a) to obtain an algebraic formula for the series
$\sum_{n \ge 0} \left(\sum_{k=0}^{n} k^2\right)x^n$. (c) Extract the coefficient of $x^n$ in (b) to obtain a formula for $\sum_{k=0}^{n} k^2$
that is a polynomial of degree 3 in $n$. (d) Explain another way to solve this problem based
on 7.90.


7.170. Use the method of 7.169 to evaluate the following sums for all n: (a) )>;4 k; (b) 
Doe=o K; (c) Dopo 3. 
7.171. Prove that for all k,n € Nt, 


57 Meth ItY) 


1F 42% +...+n* Erm (—1)**4(n +1)f 541 - 


7.172. State and prove a version of the quadratic formula for solving $AF^2 + BF + C = 0$,
where $A, B, C \in K[[x]]$ are known series and $F \in K[[x]]$ is unknown. What hypotheses must
you impose on $A$, $B$, $C$? Is the solution $F$ unique?

7.173. Recursion for Divide-and-Conquer Algorithms. Many algorithms use a
“divide-and-conquer” approach in which a problem of size $n$ is divided into $a$ subproblems
of size $n/b$, and the solutions to these subproblems are then combined in time $cn^k$ to give
the solution to the original problem. Letting $T(n)$ be the time needed to solve a problem of
size $n$, $T(n)$ will satisfy the recursion $T(n) = aT(n/b) + cn^k$ and initial condition $T(1) = d$
(where $a, b, c, d > 0$ and $k \ge 0$ are given constants). Assume for simplicity that $n$ ranges
over powers of $b$. (a) Find a recursion and initial condition satisfied by $S(m) = T(b^m)$,
where $m$ ranges over $\mathbb{N}$. (b) Use formal series to solve the recursion in (a). Deduce that, for
a suitable constant $C$ and large enough $n$,
\[ T(n) \le \begin{cases} Cn^k & \text{if } a < b^k \text{ (combining time dominates);} \\ Cn^k\log_b n & \text{if } a = b^k \text{ (dividing and combining times balance);} \\ Cn^{\log_b a} & \text{if } a > b^k \text{ (time to solve subproblems dominates).} \end{cases} \]
7.174. Merge Sort. Suppose we wish to sort a given sequence of integers $x_1, \ldots, x_n$ into
increasing order. Consider the following recursive method: if $n = 1$, the sequence is already
sorted. For $n > 1$, divide the list into two halves, sort each half recursively, and merge the
resulting sorted lists. Let $T(n)$ be the time needed to sort $n$ objects using this algorithm.
Find a recursion satisfied by $T(n)$, and use 7.173 to show that $T(n) \le Cn\log_2 n$ for some
constant $C$. (You may assume $n$ ranges over powers of 2.)
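The recursive method described in 7.174 can be sketched as follows. Counting comparisons is one possible way to measure the "time" $T(n)$; that choice, and the function name, are assumptions made for illustration, and the sketch does not derive the recursion asked for in the exercise.

```python
def merge_sort(values):
    """Return (sorted copy of values, number of element comparisons performed)."""
    if len(values) <= 1:
        return list(values), 0
    mid = len(values) // 2
    left, left_count = merge_sort(values[:mid])
    right, right_count = merge_sort(values[mid:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    # Merge step: repeatedly move the smaller front element to the output.
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons
```

Merging two sorted lists of total length $n$ uses fewer than $n$ comparisons, which is the "combining time" that enters the divide-and-conquer recursion.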


7.175. Fast Binary Multiplication. (a) Given x = ak + b and y = ck + d (where
$a, b, c, d, k \in \mathbb{N}$), verify that $xy = (ak + b)(ck + d) = ack^2 + bd + ((a + b)(c + d) - ac - bd)k$.
Take $k = 2^n$ in this identity to show that one can multiply two 2n-bit numbers by recursively
computing three products of n-bit numbers and doing several binary additions. (b) Find a
recursion describing the number of bit operations needed to multiply two n-bit numbers by
the recursive method suggested in (a). (c) Solve the recursion in (b) to determine the time
complexity of this recursive algorithm (you may assume n is a power of 2).
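The recursive method suggested in part (a) can be sketched in Python as follows. This is a minimal illustration, not a solution to the exercise: the function name `fast_multiply` and the base-case cutoff of 16 are arbitrary choices made here, and Python integers stand in for bit strings.

```python
def fast_multiply(x, y):
    """Multiply nonnegative integers using three recursive half-size products."""
    if x < 16 or y < 16:                # small inputs: multiply directly
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    k = 1 << n                          # k = 2^n
    a, b = divmod(x, k)                 # x = a*k + b
    c, d = divmod(y, k)                 # y = c*k + d
    ac = fast_multiply(a, c)
    bd = fast_multiply(b, d)
    # (a+b)(c+d) - ac - bd equals ad + bc, the coefficient of k
    middle = fast_multiply(a + b, c + d) - ac - bd
    return ac * k * k + middle * k + bd

print(fast_multiply(12345678901234567890, 98765432109876543210))
```

Only three recursive calls appear, rather than the four needed to compute ac, ad, bc, and bd separately; the multiplications by k and k*k are bit shifts.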


7.176. Formal Linear Ordinary Differential Equations. Suppose $P, Q \in K[[x]]$ are
given formal series, and we wish to find a formal series $F \in K[[x]]$ satisfying the “linear
ODE” $F' + PF = Q$ and initial condition $F(0) = c \in K$. Solve this ODE by multiplying by
the “integrating factor” $\exp(\int P\,dx)$ and using the product rule to simplify the left side.


7.177. Formal ODEs with Constant Coefficients. For fixed $c_1, \ldots, c_k \in \mathbb{C}$, let V be
the set of all formal series $F \in \mathbb{C}[[x]]$ satisfying the ODE

$$F^{(k)} + c_1 F^{(k-1)} + c_2 F^{(k-2)} + \cdots + c_k F = 0. \qquad (7.12)$$

The characteristic polynomial for this ODE is $q = x^k + c_1 x^{k-1} + c_2 x^{k-2} + \cdots + c_k \in \mathbb{C}[x]$.
Suppose q factors as $(x - r_1)^{k_1} \cdots (x - r_s)^{k_s}$ for certain $k_i > 0$ and distinct $r_i \in \mathbb{C}$. (a) Show
that the k series $x^j \exp(r_i x)$ (for $1 \le i \le s$ and $0 \le j < k_i$) lie in V. (b) Show that V is a
complex vector space, and the k series in (a) form a basis for V. (c) Describe a procedure
for expressing a given series $F \in V$ as a linear combination of the series in the basis
from part (a), given the “initial conditions” $F(0), F'(0), \ldots, F^{(k-1)}(0)$. (d) Let W be the
set of formal series $G \in \mathbb{C}[[x]]$ satisfying the non-homogeneous ODE

$$G^{(k)} + c_1 G^{(k-1)} + c_2 G^{(k-2)} + \cdots + c_k G = H,$$

where $H \in \mathbb{C}[[x]]$ is a given series. If $G^*$ is one particular series in W, show that
$W = \{F + G^* : F \in V\}$.


7.178. Characteristic Polynomial of a Recursion. This problem sets up an analogy
between recursions with constant coefficients and ordinary differential equations with
constant coefficients. For fixed $c_1, \ldots, c_k \in \mathbb{C}$, let V be the set of all formal series
$(A_n : n \ge 0) \in \mathbb{C}[[x]]$ satisfying the recursion

$$A_n = c_1 A_{n-1} + c_2 A_{n-2} + \cdots + c_k A_{n-k} \quad (n \ge k). \qquad (7.13)$$

The characteristic polynomial for this recursion is $q = x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_k \in \mathbb{C}[x]$.
Suppose q factors as $(x - r_1)^{k_1} \cdots (x - r_s)^{k_s}$ for certain $k_i > 0$ and distinct $r_i \in \mathbb{C}$. (a) Show
that the k sequences $(n^j r_i^n : n \ge 0)$ (for $1 \le i \le s$ and $0 \le j < k_i$) lie in V. (b) Show that
V is a complex vector space, and the k sequences in (a) form a basis for V. (c) Describe a
procedure for expressing a given sequence $A \in V$ as a linear combination of the sequences
in the basis from part (a). (Use the “initial conditions” $A_0, \ldots, A_{k-1}$.) (d) Let W be the
set of sequences $(B_n : n \ge 0)$ satisfying

$$B_n = c_1 B_{n-1} + c_2 B_{n-2} + \cdots + c_k B_{n-k} + g(n) \quad (n \ge k),$$

where g(n) is a given function. If $B^*$ is one particular sequence in W, show that
$W = \{A + B^* : A \in V\}$.


7.179. Fill in the details of the construction of the field of fractions of an integral domain, 
which was sketched in 7.44. Specifically, show that: (a) the relation ~ on X is reflexive, 
symmetric, and transitive; (b) addition and multiplication on F are well defined; (c) F, 
with these operations, is a field; (d) the map $i : D \to F$ is an injective ring homomorphism;
(e) F satisfies the universal mapping property stated in 7.44. 


7.180. Localization. The construction of fields of fractions can be generalized as follows.
Let R be a commutative ring (not necessarily an integral domain), and let $S \subseteq R$ be a
subset such that $1 \in S$ and $xy \in S$ whenever $x, y \in S$. Our goal is to use R to construct a
new ring in which every element of S becomes a unit.

Define an equivalence relation on $X = R \times S$ by setting $(a, s) \sim (b, t)$ iff there exists
$u \in S$ with u(at - bs) = 0. (a) Show that $\sim$ is an equivalence relation on X; let T be
the set of equivalence classes. (b) Define addition and multiplication operations on T; show
that these operations are well defined and make T into a commutative ring. (c) Define
$i : R \to T$ by letting i(r) be the equivalence class of (r, 1) for $r \in R$. Show that i is a ring
homomorphism (which may not be injective, however) such that i(s) is a unit in T for every
$s \in S$. (d) Show T has the following universal property: if U is any commutative ring and
$j : R \to U$ is any ring homomorphism such that j(s) is a unit in U for every $s \in S$, then there
exists a unique ring homomorphism $f : T \to U$ such that $j = f \circ i$.




Notes 


A detailed but rather technical treatment of the algebraic theory of polynomials and formal 
power series is given in Bourbaki [19, Ch. IV]. More facts concerning polynomials, integral 
domains, and related aspects of ring theory may be found in algebra texts [8, 70, 71]. 
Discussions of formal power series from a more combinatorial perspective may be found in 
Stanley [127, Ch. 1] and Wilf [139]. 
The Combinatorics of Formal Power Series


Now that we have the technical machinery of formal power series at our disposal, we can 
resume our combinatorial agenda of studying infinite weighted sets. We will develop versions 
of the weighted sum and product rules in this setting. We will also explore the combinatorial 
significance of other operations on formal series, like composition and exponentiation. These 
techniques will be used to obtain deeper combinatorial information about objects studied 
earlier in the book, including trees, integer partitions, and set partitions. 


8.1 Sum Rule for Infinite Weighted Sets 


8.1. Definition: Admissible Weighted Sets and Generating Functions. Suppose
S is a set with weight function $\mathrm{wt} : S \to \mathbb{N}$. The weighted set (S, wt) is called admissible
iff for every $n \ge 0$, the set $S_n = \{z \in S : \mathrm{wt}(z) = n\}$ is finite. In this case, the generating
function of the weighted set (S, wt) is the formal power series

$$G_S = G_{S,\mathrm{wt}} = \sum_{n=0}^{\infty} |S_n| x^n \in \mathbb{Q}[[x]].$$

Informally, this series represents $\sum_{z \in S} x^{\mathrm{wt}(z)}$.


8.2. Example. Every finite weighted set S is admissible. Furthermore, the generating
function for such a set is a polynomial in x, since $|S_n| = 0$ for all large enough n. We
studied generating functions of this type in Chapter 6.


8.3. Example. Let S be the set of all binary trees (with any number of vertices), and let
wt(T) be the number of vertices in T for each tree $T \in S$. Then $S_n$ consists of all binary
trees with n vertices. Even without determining the precise cardinality of $S_n$ (which we did
in 2.36), one can check that $S_n$ is finite for each $n \ge 0$. Thus, S is an admissible weighted
set. In most applications, we will not have calculated the cardinality $|S_n|$ in advance; the
whole point of using generating functions is to help solve problems like this! But in most
situations of interest, it will follow routinely from the nature of the objects and weights
that $S_n$ is finite for all n. So we will often omit explicit proofs that the weighted sets under
consideration are indeed admissible.


8.4. Example. Let (S, wt) be an admissible weighted set. Let T be any subset of S with
the same weight function as S. Then $T_n \subseteq S_n$ for all n, so that (T, wt) is also admissible.
Similarly, a finite disjoint union of admissible weighted sets is again admissible.


8.5. Theorem: Weight-Preserving Bijection Rule. Suppose $(S, \mathrm{wt}_1)$ and $(T, \mathrm{wt}_2)$ are
two weighted sets such that there exists a weight-preserving bijection $f : S \to T$ (i.e.,
$\mathrm{wt}_2(f(s)) = \mathrm{wt}_1(s)$ for all $s \in S$). Then S is admissible iff T is admissible, and $G_S = G_T$.

Proof. Because f (and hence $f^{-1}$) are weight-preserving, f restricts to bijections $f_n : S_n \to T_n$
for each $n \ge 0$. So $|S_n| = |T_n|$ for all $n \ge 0$, which implies the desired conclusions. □


At last we are ready for the most general version of the sum rule. 


8.6. Sum Rule for Infinite Weighted Sets. Suppose (S, wt) is an admissible weighted
set that is the disjoint union of subsets $\{T_i : i \in I\}$, where the index set I is finite or equal
to $\mathbb{N}$. Assume that for all $i \in I$ and all $x \in T_i$, $\mathrm{wt}_{T_i}(x) = \mathrm{wt}_S(x)$. Then

$$G_S = \sum_{i \in I} G_{T_i}.$$

Proof. Let us compute the coefficient of $x^n$ on each side, for fixed $n \ge 0$. Write $S_n$ (resp.
$(T_i)_n$) for the set of objects in S (resp. $T_i$) of weight n. By assumption, $S_n$ is a finite set
which is the disjoint union of the (necessarily finite) sets $(T_i)_n$. Let $I_n$ be the set of indices i
such that $(T_i)_n$ is nonempty; then $I_n$ must be finite, since $S_n$ is finite. By the ordinary sum
rule for finite sets,

$$|S_n| = \sum_{i \in I_n} |(T_i)_n|.$$

The left side is the coefficient of $x^n$ in $G_S$, while the right side is evidently the coefficient of $x^n$
in $\sum_{i \in I} G_{T_i}$, since the summands corresponding to $i \notin I_n$ contribute zero to the coefficient
of $x^n$. When $I = \mathbb{N}$, this argument also proves the convergence of the infinite sum of formal
power series $\sum_{i \in I} G_{T_i}$, since the coefficient of $x^n$ stabilizes once $i \ge \max\{j : j \in I_n\}$. □


8.2 Product Rule for Infinite Weighted Sets 


In this section we prove two versions of the product rule for generating functions. The first 
version is designed for situations in which we build weighted objects by making a finite 
sequence of choices, as in Chapter 1. The second version extends this rule to certain infinite 
choice sequences, which leads to formulas involving infinite products of formal power series. 


8.7. Informal Product Rule for Infinite Weighted Sets. Suppose (S, wt) is a weighted
set; k is a fixed, finite positive integer; and $(T_i, \mathrm{wt}_i)$ are admissible weighted sets for
$1 \le i \le k$. Suppose each $z \in S$ can be uniquely constructed by choosing $z_1 \in T_1$, then $z_2 \in T_2$,
..., then $z_k \in T_k$, and then assembling these choices in some manner. Further suppose that

$$\mathrm{wt}(z) = \sum_{i=1}^{k} \mathrm{wt}(z_i) \qquad (8.1)$$

for all $z \in S$. Then (S, wt) is admissible, and

$$G_S = \prod_{i=1}^{k} G_{T_i}.$$

Proof. Recast in formal terms, our hypothesis is that there is a weight-preserving bijection
from (S, wt) to the weighted set (T, wt) where $T = T_1 \times \cdots \times T_k$ and
$\mathrm{wt}(z_1, \ldots, z_k) = \sum_{i=1}^{k} \mathrm{wt}(z_i)$. So it suffices to replace S by the Cartesian product set T. Furthermore, it
suffices to prove the result when k = 2, since the general case follows by induction as in 1.5.
Fix $n \ge 0$; we are reduced to proving

$$G_{T_1 \times T_2}(n) = (G_{T_1} G_{T_2})(n).$$

The left side is the cardinality of the set $A = \{(t_1, t_2) \in T_1 \times T_2 : \mathrm{wt}_1(t_1) + \mathrm{wt}_2(t_2) = n\}$.
Now A is the disjoint union of the sets $(T_1)_k \times (T_2)_{n-k}$, where $(T_1)_k$ (resp. $(T_2)_{n-k}$) is the
finite set of objects in $T_1$ (resp. $T_2$) of weight k (resp. n - k), and k ranges from 0 to n. So
A is a finite set (proving admissibility of $T_1 \times T_2$), and the ordinary sum and product rules
for finite unweighted sets give

$$|A| = \sum_{k=0}^{n} |(T_1)_k| \cdot |(T_2)_{n-k}| = \sum_{k=0}^{n} G_{T_1}(k) \cdot G_{T_2}(n-k).$$

This sum is precisely the coefficient of $x^n$ in $G_{T_1} G_{T_2}$, so we are done. □


The following technical device will allow us to obtain generating functions for objects 
that are built by making an infinite sequence of choices. 


8.8. Definition: Restricted Cartesian Product. Suppose $\{(T_n, \mathrm{wt}_n) : n \ge 1\}$ is a
countable collection of admissible weighted sets such that every $T_n$ contains exactly one
element of weight zero; call this element $1_n$. Let $T = \prod^{*}_{n \ge 1} T_n$ be the set of all infinite
sequences $(t_n : n \ge 1)$ such that $t_n \in T_n$ for all n and $t_n = 1_n$ for all but finitely many
indices n. We make T into a (not necessarily admissible) weighted set by defining
$\mathrm{wt}((t_n : n \ge 1)) = \sum_{n \ge 1} \mathrm{wt}(t_n)$; this sum is defined since all but finitely many summands are zero.


8.9. Product Rule for the Restricted Cartesian Product. Let $\{(T_n, \mathrm{wt}_n) : n \ge 1\}$
and $T = \prod^{*}_{n \ge 1} T_n$ be as in 8.8. If $\mathrm{ord}(G_{T_n} - 1) \to \infty$ as $n \to \infty$, then (T, wt) is admissible
and

$$G_T = \prod_{n=1}^{\infty} G_{T_n}.$$

Proof. The condition on the orders of $G_{T_n} - 1$ ensures that the infinite product of formal
series is defined (see 7.33(c)). Fix $m \ge 0$; let us compute $G_T(m)$. Choose N so that n > N
implies $\mathrm{ord}(G_{T_n} - 1) > m$. Consider an object $t = (t_n : n \ge 1)$ in T. If $t_n \ne 1_n$ for some
n > N, then $\mathrm{wt}(t) \ge \mathrm{wt}_n(t_n) > m$, so this object does not contribute to the coefficient
$G_T(m)$. So we need only consider objects where $t_n = 1_n$ for all n > N. Dropping all
coordinates after position N gives a weight-preserving bijection between this set of objects
and the weighted set $T_1 \times \cdots \times T_N$. We already know that the generating function for this
weighted set is $\prod_{n=1}^{N} G_{T_n}$. So

$$G_T(m) = \left(\prod_{n=1}^{N} G_{T_n}\right)_m = \left(\prod_{n=1}^{\infty} G_{T_n}\right)_m;$$

the last equality holds since $\mathrm{ord}(G_{T_n} - 1) > m$ for n > N. This argument has shown that
$G_T(m)$ is finite for each m, which is equivalent to admissibility of T. □


To apply this result, we start with some weighted set (S, wt) and describe an “infinite
choice sequence” for building objects in S by choosing “building blocks” from the sets $T_n$.
Each set $T_n$ has a “dummy object” of weight zero. Any particular choice sequence must
eventually terminate by choosing the dummy object for all sufficiently large n, but there
is no fixed bound on the number of “non-dummy” choices we might make. This informal
choice procedure amounts to giving a weight-preserving bijection from S to the restricted
product $\prod^{*}_{n \ge 1} T_n$. We can then conclude that $G_S = \prod_{n \ge 1} G_{T_n}$, provided that the infinite
product on the right side converges.
8.3 Generating Functions for Trees


This section illustrates the sum and product rules for infinite weighted sets by deriving the 
generating functions for various classes of trees. 


8.10. Example: Binary Trees. Let S be the set of all binary trees, weighted by the
number of vertices. By definition (see 2.36), every tree $t \in S$ is either empty or is an
ordered triple $(\bullet, t_1, t_2)$, where $t_1$ and $t_2$ are binary trees. Let $S_0$ be the one-element set
consisting of the empty binary tree, let $S^+ = S \setminus S_0$ be the set of nonempty binary trees,
and let $N = \{\bullet\}$ be a one-element set such that $\mathrm{wt}(\bullet) = 1$. By definition of the generating
function for a weighted set, we have $G_{S_0} = x^0 = 1$ and $G_N = x^1 = x$. By the sum rule for
infinite weighted sets,

$$G_S = G_{S_0} + G_{S^+} = 1 + G_{S^+}.$$

By the recursive definition of nonempty binary trees, we can uniquely construct every tree
$t \in S^+$ by: (i) choosing the root node $\bullet \in N$; (ii) choosing the left subtree $t_1 \in S$; (iii)
choosing the right subtree $t_2 \in S$; and assembling these choices to form the tree $t = (\bullet, t_1, t_2)$.
It follows from the product rule for infinite weighted sets that

$$G_{S^+} = G_N G_S G_S = x G_S^2.$$

Writing F to denote the unknown generating function $G_S$, we conclude that F satisfies the
equation

$$F = 1 + xF^2$$

in $\mathbb{Q}[[x]]$, which is equivalent to $xF^2 - F + 1 = 0$. Furthermore, F(0) = 1 since there is
exactly one binary tree with zero vertices. In 7.80, we saw that this quadratic equation and
initial condition has the unique solution

$$F = \frac{1 - \sqrt{1 - 4x}}{2x} = \sum_{n \ge 0} \frac{1}{2n+1} \binom{2n+1}{n+1,\, n} x^n.$$

Taking the coefficient of $x^n$ gives the number of binary trees with n nodes, which is the
Catalan number $C_n$. A more combinatorial approach to this result was given in Chapter 2.
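As a quick sanity check on the functional equation $F = 1 + xF^2$, the following Python sketch extracts the coefficients of F one degree at a time and compares them with the closed form for the Catalan numbers. The function name `binary_tree_counts` is a label chosen here for illustration, not notation from the text.

```python
from math import comb

def binary_tree_counts(N):
    """Coefficients F_0, ..., F_N of the series F satisfying F = 1 + x*F^2."""
    F = [1]                             # F_0 = 1: the empty binary tree
    for n in range(1, N + 1):
        # the coefficient of x^n in x*F^2 is the sum of F_i * F_j over i + j = n - 1
        F.append(sum(F[i] * F[n - 1 - i] for i in range(n)))
    return F

counts = binary_tree_counts(10)
catalan = [comb(2 * n, n) // (n + 1) for n in range(11)]
print(counts == catalan, counts)
```

The inner sum mirrors the construction in the example: a nonempty tree with n vertices consists of a root, a left subtree with i vertices, and a right subtree with n - 1 - i vertices.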


8.11. Example: Full Binary Trees. A binary tree is called full iff every vertex in the
tree has either zero or two (nonempty) children. In the context of binary trees, a leaf is a
vertex with zero children. Let S be the set of nonempty full binary trees, weighted by the
number of leaves. We can write S as the disjoint union of $S_1 = \{(\bullet, \emptyset, \emptyset)\}$ and $S_{\ge 2} = S \setminus S_1$.
We can build an element t of $S_{\ge 2}$ by choosing any $t_1 \in S$ as the (nonempty) left subtree
of the root, and then choosing any $t_2 \in S$ as the right subtree of the root. Note that
$\mathrm{wt}(t) = \mathrm{wt}(t_1) + \mathrm{wt}(t_2)$ since the weight is the number of leaves. So, by the product rule,
$G_{S_{\ge 2}} = G_S^2$. We see directly that $G_{S_1} = x$. The sum rule now gives the relation

$$G_S = x + G_S^2,$$

with $G_S(0) = 0$. Solving the quadratic $G_S^2 - G_S + x = 0$ by calculations analogous to those
in 7.80, we find that

$$G_S = \frac{1 - \sqrt{1 - 4x}}{2} = xF,$$

where F is the generating function considered in the previous example. It follows that
$G_S(n) = F(n-1) = C_{n-1}$ for all $n \ge 1$.
8.12. Example: Ordered Trees. Let S be the set of ordered trees, weighted by the number
of vertices. We recall the recursive definition of ordered trees from 3.79. First, 0 is an ordered
tree with one vertex. Second, for every integer $k \ge 1$ and every choice of ordered trees
$t_1, \ldots, t_k \in S$, the tuple $t = (k, t_1, t_2, \ldots, t_k)$ is an ordered tree, and the number of vertices
of t is $1 + \sum_{i=1}^{k} \mathrm{wt}(t_i)$. (Informally,
t represents a tree whose root has k children, which are ordered from left to right, and where
each child is itself an ordered tree.) All ordered trees arise by applying the two rules a finite
number of times. The first rule can be considered a degenerate version of the second rule in
which k = 0. Let us find the generating function $G_S$. First, write S as the disjoint union of
sets $\{S_k : k \ge 0\}$ where $S_k$ consists of all trees $t \in S$ such that the root node has k children.
By the sum rule for infinite weighted sets,

$$G_S = \sum_{k=0}^{\infty} G_{S_k}.$$

(One can verify the admissibility hypothesis on S by noting that every tree in $S_k$ has k or
more leaves.) For each $k \ge 0$, a direct application of the product rule (with k + 1 choices)
shows that

$$G_{S_k} = x^1 \cdot \underbrace{G_S \cdot \ldots \cdot G_S}_{k} = x G_S^k.$$

(The x arises by choosing the root node from the set $\{\bullet\}$.) Substitution into the previous
formula gives

$$G_S = \sum_{k=0}^{\infty} x G_S^k = \frac{x}{1 - G_S};$$

the last step is valid (using 7.41) since $G_S(0) = 0$. Doing algebra in the ring $\mathbb{Q}[[x]]$ leads
to the relation $G_S^2 - G_S + x = 0$. This is the same equation that occurred in the previous
example. So we conclude, as before, that

$$G_S = \frac{1 - \sqrt{1 - 4x}}{2} = \sum_{n \ge 1} C_{n-1} x^n.$$


Our generating function calculations have led us to the following (possibly unexpected)
enumeration result: the set of binary trees with n vertices, the set of full binary trees with
n + 1 leaves, and the set of ordered trees with n + 1 vertices all have cardinality $C_n$. Now that
this result is in hand, it is natural to seek a bijective proof in which the three sets of objects
are linked by explicitly defined bijections. Some methods for building such bijections from
recursions were studied in Chapter 2. Here we are seeking weight-preserving bijections on
infinite sets, which can be defined as follows.

Let S denote the set of all binary trees, and let T be the set of all nonempty full
binary trees. We define a weight-preserving bijection $f : S \to T$ recursively by setting
$f(\emptyset) = (\bullet, \emptyset, \emptyset)$ and

$$f((\bullet, t_1, t_2)) = (\bullet, f(t_1), f(t_2)).$$

See Figure 8.1. To see that the weights work, first note that the zero-vertex tree $\emptyset$ is mapped
to the one-leaf tree $(\bullet, \emptyset, \emptyset)$. In the recursive formula, suppose $t_1$ and $t_2$ have a vertices and
b vertices, respectively. By induction, $f(t_1)$ and $f(t_2)$ are nonempty full binary trees with
a + 1 leaves and b + 1 leaves, respectively. It follows that f sends the tree $(\bullet, t_1, t_2)$ with
a + b + 1 vertices to a full binary tree with (a + 1) + (b + 1) = (a + b + 1) + 1 leaves, as
desired. The inverse of f has an especially simple pictorial description: just erase all the
leaves! This works since a nonempty full binary tree always has one more leaf vertex than
internal (non-leaf) vertex (see 8.50).
FIGURE 8.1
Bijection between binary trees and full binary trees.


Now let U be the set of all ordered trees. We define a weight-preserving bijection
$g : T \to U$. First, $g((\bullet, \emptyset, \emptyset)) = 0$. Second, if $t = (\bullet, t_1, t_2) \in T$ with $t_1$ and $t_2$ nonempty, define
g(t) by starting with the ordered tree $g(t_1)$ and appending the ordered tree $g(t_2)$ as a new,
rightmost child of the root node of $g(t_1)$. See Figure 8.2. More formally, if $g(t_1) = k\, u_1 \cdots u_k$,
let $g(t) = (k + 1)\, u_1 \cdots u_k\, g(t_2)$. As above, one may check that the number of vertices in
g(t) equals the number of leaves in t, as required.
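The two recursive bijections can be tried out on small cases. The Python sketch below uses encodings chosen here for convenience, which are assumptions and not the text's notation: the empty binary tree is `None`, a binary tree vertex is a pair `(left, right)` with the root symbol omitted, and an ordered tree is the list of its root's subtrees rather than a term. The helper names `binary_trees`, `leaves`, and `vertices` are likewise illustrative.

```python
def binary_trees(n):
    """All binary trees with n vertices."""
    if n == 0:
        return [None]
    return [(l, r)
            for i in range(n)
            for l in binary_trees(i)
            for r in binary_trees(n - 1 - i)]

def f(t):
    """Binary tree -> nonempty full binary tree."""
    if t is None:
        return (None, None)             # the one-leaf tree
    left, right = t
    return (f(left), f(right))

def g(t):
    """Nonempty full binary tree -> ordered tree (list of the root's children)."""
    left, right = t
    if left is None and right is None:
        return []                       # single-vertex ordered tree
    return g(left) + [g(right)]         # g(right) becomes the new rightmost child

def leaves(t):
    """Number of leaves of a nonempty full binary tree."""
    left, right = t
    if left is None and right is None:
        return 1
    return leaves(left) + leaves(right)

def vertices(u):
    """Number of vertices of an ordered tree."""
    return 1 + sum(vertices(child) for child in u)

for n in range(6):
    images = [g(f(t)) for t in binary_trees(n)]
    print(n, len(images), len({repr(u) for u in images}))
```

For each n, the two printed counts agree, so distinct binary trees with n vertices are sent to distinct ordered trees, and each image has n + 1 vertices.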


8.13. Remark. These examples show that generating functions are a powerful algebraic 
tool for deriving enumeration results. However, once such results are found, it is often 
desirable to find direct combinatorial proofs that do not rely on generating functions. In 
particular, bijective proofs are more informative (and often more elegant) than algebraic 
proofs in the sense that they give us an explicit pairing between the objects in two sets. 


8.4 Compositional Inversion Formulas 


Let $F = \sum_{n \ge 1} F_n x^n$ be a formal series with F(0) = 0 and $F_1 \ne 0$. We have seen in 7.65
that there is a unique series $G = \sum_{n \ge 1} G_n x^n$ with G(0) = 0 and $G_1 \ne 0$ such that
$F \bullet G = x = G \bullet F$. Our goal in this section is to find combinatorial and algebraic formulas
for the coefficients of G.

Since $F_1 \ne 0$, we can write F = x/R where $R = \sum_{n \ge 0} R_n x^n$ is a series with $R_0 \ne 0$
(see 7.40). We have $F \bullet G = x$ iff $G/(R \bullet G) = x$ iff $G = x(R \bullet G)$. It turns out that we
can solve the equation $G = x(R \bullet G)$ by taking G to be the generating function for the set
of ordered trees (or equivalently, terms), relative to a suitable weight function. This is the
essence of the following combinatorial formula for G.

FIGURE 8.2
Bijection between full binary trees and ordered trees.


8.14. Theorem: Combinatorial Compositional Inversion. Let F = x/R where
$R = \sum_{n \ge 0} R_n x^n$ is a given series in K[[x]] with $R_0 \ne 0$. Let T be the set of all terms (§3.13).
Let the weight of a term $w = w_1 w_2 \cdots w_s \in T$ be $\mathrm{wt}(w) = R_{w_1} R_{w_2} \cdots R_{w_s} x^s$. Then
$G = G_T = \sum_{w \in T} \mathrm{wt}(w)$ is the compositional inverse of F.


Proof. Note first that, for any two words v and w, wt(vw) = wt(v) wt(w). Also G(0) = 0,
since every term has positive length. Now, by 3.85 we know that for every term $w \in T$, there
exist a unique integer $n \ge 0$ and unique terms $t_1, \ldots, t_n \in T$ such that $w = n\, t_1 t_2 \cdots t_n$.
For fixed n, we build such a term by choosing the symbol n (which has weight $xR_n$), then
choosing terms $t_1 \in T$, $t_2 \in T$, ..., $t_n \in T$. By the product rule for generating functions
(which generalizes to handle the current weights), the generating function for terms starting
with n is therefore $xR_nG^n$. Now by the sum rule, we conclude that

$$G = \sum_{n \ge 0} xR_nG^n = x(R \bullet G).$$

By the remarks preceding the theorem, this shows that $F \bullet G = x$, as desired. □


In 3.91 we gave a formula that counts all terms in a given anagram class $R(0^{k_0}1^{k_1}2^{k_2} \cdots)$.
Combining this formula with the previous result, we deduce the following algebraic recipe
for the coefficients of G.


8.15. Theorem: Lagrange’s Inversion Formula. Let F = x/R where $R = \sum_{n \ge 0} R_n x^n$
is a given series in K[[x]] with $R_0 \ne 0$. Let G be the compositional inverse of F. Then

$$G(n) = \frac{(R^n)_{n-1}}{n} = \frac{1}{n!}\left[\left(\frac{d}{dx}\right)^{n-1} R^n\right]_0 \qquad (n \ge 1).$$
Proof. The second equality follows routinely from the definition of formal differentiation.
As for the first, let $T_n$ be the set of terms of length n. Since every term of length n has a
weight divisible by exactly $x^n$, 8.14 gives

$$G(n)\,x^n = \sum_{w \in T_n} \mathrm{wt}(w).$$

Let us group together summands on the right side corresponding to terms of length n
that contain $k_0$ zeroes, $k_1$ ones, etc., where $\sum_{i \ge 0} k_i = n$. Each such term has weight
$x^n R_0^{k_0} R_1^{k_1} \cdots$, and the number of such terms is $\frac{1}{n}\binom{n}{k_0, k_1, k_2, \ldots}$, provided that
$k_0 = 1 + \sum_{i \ge 1} (i - 1)k_i$ (3.91). Summing over all possible choices of the $k_i$, we get

$$G(n)\,x^n = \sum_{\substack{(k_0, k_1, k_2, \ldots):\\ \sum_{i \ge 0} k_i = n,\ k_0 = 1 + \sum_{i \ge 1} (i-1)k_i}} \frac{1}{n}\binom{n}{k_0, k_1, k_2, \ldots} x^n R_0^{k_0} R_1^{k_1} R_2^{k_2} \cdots.$$

Now, in the presence of the condition $\sum_{i \ge 0} k_i = n$, the equation $k_0 = 1 + \sum_{i \ge 1} (i - 1)k_i$
holds iff $\sum_{i \ge 0} (i - 1)k_i = -1$ iff $\sum_{i \ge 0} i k_i = n - 1$. So

$$G(n) = \sum_{\substack{(k_0, k_1, k_2, \ldots):\\ \sum_{i \ge 0} k_i = n,\ \sum_{i \ge 0} i k_i = n-1}} \frac{1}{n}\binom{n}{k_0, k_1, k_2, \ldots} R_0^{k_0} R_1^{k_1} R_2^{k_2} \cdots.$$

On the other hand, 7.11 gives

$$\frac{(R^n)_{n-1}}{n} = \sum_{\substack{(k_0, k_1, k_2, \ldots):\\ \sum_{i \ge 0} k_i = n,\ \sum_{i \ge 0} i k_i = n-1}} \frac{1}{n}\binom{n}{k_0, k_1, k_2, \ldots} R_0^{k_0} R_1^{k_1} R_2^{k_2} \cdots.$$

The right sides agree, so we are done. □


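8.16. Example. Let us use 8.15 to find the compositional inverse G of the formal series
$F = x/e^x$. Here $R = e^x = \sum_{k \ge 0} x^k/k!$, so $R^n = e^{nx} = \sum_{k \ge 0} (n^k/k!)x^k$. It follows that

$$G(n) = \frac{(R^n)_{n-1}}{n} = \frac{n^{n-1}}{n\,(n-1)!} = \frac{n^{n-1}}{n!}.$$

Thus, $G = \sum_{n \ge 1} \frac{n^{n-1}}{n!} x^n$.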
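The conclusion of Example 8.16 can be checked with exact arithmetic on truncated series. The Python sketch below is an illustration under stated assumptions: the truncation degree `N = 8` and the helper names `mul` and `exp_series` are choices made here. It forms G with coefficients $n^{n-1}/n!$ and verifies that $G \cdot \exp(-G)$, which is F composed with G for $F = x/e^x$, agrees with x through degree N.

```python
from fractions import Fraction
from math import factorial

N = 8  # work modulo x^(N+1)

def mul(A, B):
    """Product of two coefficient lists, truncated after x^N."""
    C = [Fraction(0)] * (N + 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            if i + j <= N:
                C[i + j] += a * b
    return C

def exp_series(H):
    """exp(H) for a series H with zero constant term, truncated after x^N."""
    result = [Fraction(0)] * (N + 1)
    power = [Fraction(1)] + [Fraction(0)] * N       # H^0
    for k in range(N + 1):                          # H^k has order >= k, so k <= N suffices
        for n in range(N + 1):
            result[n] += power[n] / factorial(k)
        power = mul(power, H)
    return result

# G(n) = n^(n-1)/n! for n >= 1, and G(0) = 0
G = [Fraction(0)] + [Fraction(n ** (n - 1), factorial(n)) for n in range(1, N + 1)]
composite = mul(G, exp_series([-c for c in G]))     # F(G) = G * exp(-G)
print(composite)
```

The printed list has a 1 in position 1 and zeros elsewhere, as expected for the series x.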


8.5 Generating Functions for Partitions 


This section uses formal power series to prove some fundamental results involving integer 
partitions. Recall from §2.8 that Par denotes the set of all integer partitions. Our first result 
gives an infinite product formula for the partition generating function. 


8.17. Theorem: Partition Generating Function.

$$\sum_{\mu \in \mathrm{Par}} x^{|\mu|} = \prod_{i=1}^{\infty} \frac{1}{1 - x^i}.$$

Proof. The proof is an application of the infinite product rule 8.9. We build a typical
partition $\mu \in \mathrm{Par}$ by making an infinite sequence of choices, as follows. First, choose how
many parts of size 1 will occur in $\mu$. The possible choices here are 0, 1, 2, 3, .... The generating
function for this choice (relative to area) is $1 + x + x^2 + x^3 + \cdots = (1 - x)^{-1}$. Second, choose
how many parts of size 2 will occur in $\mu$. Again the possibilities are 0, 1, 2, 3, ..., and the
generating function for this choice is $1 + x^2 + x^4 + x^6 + \cdots = (1 - x^2)^{-1}$. Proceed similarly,
choosing for every $i \ge 1$ how many parts of size i will occur in $\mu$. The generating function
for choice i is $\sum_{k=0}^{\infty} (x^i)^k = (1 - x^i)^{-1}$. Multiplying the generating functions for all the
choices gives the infinite product in the theorem.

Here is a more formal rephrasing of the proof just given. For each $i \ge 1$, let $T_i$ be
the set of all integer partitions $\nu$ (including the empty partition of zero) such that every
part of $\nu$ is equal to i. As argued above, $G_{T_i} = (1 - x^i)^{-1}$. Given any partition $\mu$, write
$\mu = (1^{a_1} 2^{a_2} \cdots i^{a_i} \cdots)$ to indicate that $\mu$ has $a_i$ parts equal to i for all i. (Note that $a_i = 0$ for
large enough i.) Then the map $(1^{a_1} 2^{a_2} \cdots i^{a_i} \cdots) \mapsto ((1^{a_1}), (2^{a_2}), \ldots, (i^{a_i}), \ldots)$ is a
weight-preserving bijection from Par onto $\prod^{*}_{i \ge 1} T_i$. The result now follows directly from 8.9. □
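The choice sequence in this proof translates directly into a computation: multiplying a truncated series by $(1 - x^i)^{-1}$ allows any number of additional parts equal to i. The Python sketch below (the function name `partition_counts` is chosen here for illustration) performs these multiplications for i = 1, ..., N and returns the number of partitions of each n up to N; factors with i > N cannot affect these coefficients.

```python
def partition_counts(N):
    """Coefficients of x^0, ..., x^N in the product of 1/(1 - x^i) over i >= 1."""
    coeffs = [1] + [0] * N              # the series 1 (only the empty partition so far)
    for i in range(1, N + 1):
        # multiply by 1/(1 - x^i); going upward in n lets part i be reused any number of times
        for n in range(i, N + 1):
            coeffs[n] += coeffs[n - i]
    return coeffs

print(partition_counts(10))
```

For instance, the coefficient of $x^4$ is 5, matching the five partitions (4), (3,1), (2,2), (2,1,1), (1,1,1,1).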


We can add another variable to the partition generating function to keep track of
additional information. Recall that, for $\mu \in \mathrm{Par}$, $\mu_1$ is the length of the first (longest) part of $\mu$,
and $\ell(\mu)$ is the number of nonzero parts of $\mu$.


8.18. Theorem: Enumerating Partitions by Area and Length.

$$\sum_{\mu \in \mathrm{Par}} t^{\ell(\mu)} x^{|\mu|} = \prod_{i=1}^{\infty} \frac{1}{1 - tx^i} = \sum_{\mu \in \mathrm{Par}} t^{\mu_1} x^{|\mu|} \quad \text{in } \mathbb{Q}(t)[[x]].$$

Proof. To prove the first equality, we modify the preceding argument to take into account
the t-weight. At stage i, suppose we choose k copies of the part i for inclusion in $\mu$. This will
increase $\ell(\mu)$ by k and increase $|\mu|$ by ki. So the generating function for the choice made at
stage i is

$$\sum_{k \ge 0} t^k x^{ki} = \sum_{k \ge 0} (tx^i)^k = \frac{1}{1 - tx^i}.$$

The result now follows from the product rule, as before. To prove the second equality, observe
that conjugation is a bijection on Par that preserves area and satisfies $(\mu')_1 = \ell(\mu)$. □


We can use variations of the preceding arguments to derive generating functions for 
various classes of integer partitions. 


8.19. Theorem: Partitions with Odd Parts. Let OddPar be the set of integer partitions
all of whose parts are odd. Then

$$\sum_{\mu \in \mathrm{OddPar}} x^{|\mu|} = \prod_{k=1}^{\infty} \frac{1}{1 - x^{2k-1}}.$$

Proof. Repeat the proof of 8.17, but now only make choices for the odd part lengths 1, 3, 5, 7,
etc. □


8.20. Theorem: Partitions with Distinct Parts. Let DisPar be the set of integer
partitions all of whose parts are distinct. Then

$$\sum_{\mu \in \mathrm{DisPar}} x^{|\mu|} = \prod_{i=1}^{\infty} (1 + x^i).$$

Proof. We build a partition $\mu \in \mathrm{DisPar}$ via the following choice sequence. For each part
length $i \ge 1$, either choose to not use that part in $\mu$ or to include that part in $\mu$ (note that
the part is only allowed to occur once). The generating function for this choice is $1 + x^i$.
The result now follows from the product rule 8.9. □


By comparing the generating functions in the last two theorems, we are led to the 
following unexpected result. 


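8.21. Theorem: OddPar vs. DisPar.

$$\sum_{\mu \in \mathrm{OddPar}} x^{|\mu|} = \sum_{\nu \in \mathrm{DisPar}} x^{|\nu|}.$$

Proof. We make the following calculation with formal power series:

$$\begin{aligned}
\sum_{\mu \in \mathrm{OddPar}} x^{|\mu|} &= \prod_{k=1}^{\infty} \frac{1}{1 - x^{2k-1}} \\
&= \prod_{k=1}^{\infty} \frac{1}{1 - x^{2k-1}} \prod_{j=1}^{\infty} \frac{1}{1 - x^{2j}} \prod_{j=1}^{\infty} (1 - x^{2j}) \\
&= \prod_{i=1}^{\infty} \frac{1}{1 - x^{i}} \prod_{j=1}^{\infty} (1 - x^{2j}) \\
&= \prod_{i=1}^{\infty} \frac{1}{1 - x^{i}} \prod_{j=1}^{\infty} (1 - x^{j})(1 + x^{j}) \\
&= \prod_{j=1}^{\infty} (1 + x^{j}) \\
&= \sum_{\nu \in \mathrm{DisPar}} x^{|\nu|}.
\end{aligned}$$

The first and last equalities hold by the two preceding theorems. The penultimate equality
uses a cancellation of infinite products that was formally justified in 7.42; the same example
(with x replaced by $x^2$) justifies the second equality. The third and fourth equalities can be
verified by similar methods; the reader should fill in the details here (cf. 7.140). □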
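The identity of Theorem 8.21 can be confirmed coefficient by coefficient up to any fixed degree. In the Python sketch below, the function names `odd_part_counts` and `distinct_part_counts` are illustrative choices made here; each function expands the corresponding infinite product, truncated after $x^N$.

```python
def odd_part_counts(N):
    """Coefficients of the product of 1/(1 - x^i) over odd i, through x^N."""
    coeffs = [1] + [0] * N
    for i in range(1, N + 1, 2):
        for n in range(i, N + 1):           # upward: part i may repeat
            coeffs[n] += coeffs[n - i]
    return coeffs

def distinct_part_counts(N):
    """Coefficients of the product of (1 + x^i) over i >= 1, through x^N."""
    coeffs = [1] + [0] * N
    for i in range(1, N + 1):
        for n in range(N, i - 1, -1):       # downward: part i is used at most once
            coeffs[n] += coeffs[n - i]
    return coeffs

print(odd_part_counts(10))
print(distinct_part_counts(10))
```

Both lines print the same list. The only difference between the two loops is the direction of the inner iteration, which decides whether a part may be reused.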




8.6 Partition Bijections 


We have just seen that the generating function for partitions into odd parts (relative to
area) coincides with the generating function for partitions with distinct parts. We gave
an algebraic proof of this result based on manipulation of infinite products in the formal
power series ring $\mathbb{Q}[[x]]$. However, from the combinatorial standpoint, it is natural to ask
for a bijective proof of the same result. We therefore seek an area-preserving bijection
$F : \mathrm{OddPar} \to \mathrm{DisPar}$. Two such bijections are presented in this section.


8.22. Sylvester’s Bijection. Define $F : \mathrm{OddPar} \to \mathrm{DisPar}$ as follows. Given $\mu \in \mathrm{OddPar}$,
draw a centered version of the Ferrers diagram of $\mu$ in which the middle boxes of the parts
of $\mu$ are all drawn in the same column; see Figure 8.3. Note that each part of $\mu$ does have
a middle box, because the part is odd. Label the columns in the centered diagram of $\mu$ as
-k, ..., -2, -1, 0, 1, 2, ..., k from left to right, so the center column is column 0. Label the
rows 1, 2, 3, ... from top to bottom. We define $\nu = F(\mu)$ by dissecting the centered diagram
of $\mu$ into a sequence of disjoint L-shaped pieces (described below), and letting the parts of $\nu$
be the number of cells in each piece. The first L-shaped piece consists of all cells in column 0
together with all cells to the right of column 0 in row 1. The second L-shaped piece consists
of all cells in column -1 together with all cells left of column -1 in row 1. The third piece
consists of the unused cells in column 1 (so row 1 is excluded) together with all cells right
of column 1 in row 2. The fourth piece consists of the unused cells in column -2 together
with all cells left of column -2 in row 2. We proceed similarly, working outwards in both
directions from the center column, cutting off L-shaped pieces that alternately move up and
right, then up and left (see Figure 8.3).

F((13,13,11,11,11,7,7,7,7,5,3,3,1,1,1)) = (21,17,16,13,11,9,8,3,2,1)

FIGURE 8.3
Sylvester’s partition bijection.

One may check geometrically that the size of each L-shaped piece is strictly less than
the size of the preceding piece. It follows that $F(\mu) = \nu = (\nu_1 > \nu_2 > \cdots)$ is indeed an
element of DisPar. Furthermore, since $|\mu|$ is the sum of the sizes of all the L-shaped pieces,
the map $F : \mathrm{OddPar} \to \mathrm{DisPar}$ is area-preserving. We must also check that F is a bijection
by constructing a map $G : \mathrm{DisPar} \to \mathrm{OddPar}$ that is the two-sided inverse of F.

To see how to define G, let us examine more closely the dimensions of the L-shaped pieces
that appear in the definition of $F(\mu)$. Note that each L-shaped piece consists of a corner
square, a “vertical portion” of zero or more squares below the corner, and a “horizontal
portion” of zero or more squares to the left or right of the corner. Let $y_0$ be the number
of cells in column 0 of the centered diagram of $\mu$ (so $y_0 = \ell(\mu)$). For all $i \ge 1$, let $x_i$
be the number of cells in the horizontal portion of the (2i - 1)th L-shaped piece for $\mu$.
For all i > 0, let $y_i$ be the number of cells in the vertical portion of the 2ith L-shaped
piece for $\mu$. For example, in Figure 8.3 we have $(y_0, y_1, y_2, \ldots) = (15, 11, 8, 6, 1, 0, 0, \ldots)$ and
$(x_1, x_2, \ldots) = (6, 5, 3, 2, 1, 0, 0, \ldots)$. Note that for all $i \ge 1$, $y_{i-1} > y_i$ whenever $y_{i-1} > 0$,
and $x_i > x_{i+1}$ whenever $x_i > 0$. Moreover, by the symmetry of the centered diagram of $\mu$
and the definition of F, we see that

$$\begin{aligned}
\nu_1 &= y_0 + x_1, & \nu_2 &= x_1 + y_1, \\
\nu_3 &= y_1 + x_2, & \nu_4 &= x_2 + y_2, \\
\nu_5 &= y_2 + x_3, & \nu_6 &= x_3 + y_3,
\end{aligned}$$

and, in general,

$$\nu_{2i-1} = y_{i-1} + x_i \quad (i \ge 1), \qquad \nu_{2i} = x_i + y_i \quad (i \ge 1). \qquad (8.2)$$


To compute $G(\nu)$ for $\nu \in \mathrm{DisPar}$, we need to solve the preceding system of equations for
$x_i$ and $y_i$, given the part lengths $\nu_j$. Noting that $\nu_k$, $x_k$, and $y_k$ must all be zero for large
enough indices $k$, we can solve for each variable by taking the alternating sum of all the
given equations from some point forward. This forces us to define
\[ y_i = \nu_{2i+1} - \nu_{2i+2} + \nu_{2i+3} - \nu_{2i+4} + \cdots \quad (i \ge 0); \]
\[ x_i = \nu_{2i} - \nu_{2i+1} + \nu_{2i+2} - \nu_{2i+3} + \cdots \quad (i \ge 1). \]

One verifies immediately that these choices of $x_i$ and $y_i$ do indeed satisfy the equations
$\nu_{2i-1} = y_{i-1} + x_i$ and $\nu_{2i} = x_i + y_i$. Furthermore, because the nonzero parts of $\nu$ are
distinct, the required inequalities ($y_{i-1} > y_i$ whenever $y_{i-1} > 0$, and $x_i > x_{i+1}$ whenever
$x_i > 0$) also hold. Now that we know the exact shape of each L-shaped piece, we can fit the
pieces together to recover the centered diagram of $\mu = G(\nu) \in \mathrm{OddPar}$. For example, given
$\nu = (9, 8, 5, 3, 1, 0, 0, \ldots)$, we compute

\[ y_0 = 9 - 8 + 5 - 3 + 1 = 4 \]
\[ x_1 = 8 - 5 + 3 - 1 = 5 \]
\[ y_1 = 5 - 3 + 1 = 3 \]
\[ x_2 = 3 - 1 = 2 \]
\[ y_2 = 1. \]


Using this data to reconstitute the centered diagram, we find that $G(\nu) = (11, 7, 5, 3)$. In
closing, we remark that bijectivity of $F$ is equivalent to the fact that, for each $\nu \in \mathrm{DisPar}$,
the system of equations (8.2) has exactly one solution for the unknowns $x_i$ and $y_i$.
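The alternating-sum solution of system (8.2) is easy to check by machine. The following is a minimal Python sketch, not part of the original text; the function names `solve_sylvester_system` and `satisfies_8_2` are hypothetical, and the input is assumed to be a strictly decreasing tuple of positive integers with trailing zeros omitted. It only solves for the $x_i$ and $y_i$; it does not rebuild the centered diagram.

```python
def solve_sylvester_system(nu):
    """Solve system (8.2) for a partition nu with distinct parts.

    Returns (x, y) where x = [x_1, x_2, ...] and y = [y_0, y_1, ...];
    trailing zero values are omitted.
    """
    def alt_sum(start):
        # nu_start - nu_{start+1} + nu_{start+2} - ...  (1-based index)
        return sum((-1) ** j * part for j, part in enumerate(nu[start - 1:]))

    n = len(nu)
    y = [alt_sum(2 * i + 1) for i in range((n + 1) // 2)]  # y_i for i >= 0
    x = [alt_sum(2 * i) for i in range(1, n // 2 + 1)]     # x_i for i >= 1
    return x, y


def satisfies_8_2(nu, x, y):
    """Check nu_{2i-1} = y_{i-1} + x_i and nu_{2i} = x_i + y_i for all i >= 1."""
    def get(seq, k):
        return seq[k] if 0 <= k < len(seq) else 0

    for i in range(1, len(nu) // 2 + 2):
        x_i = get(x, i - 1)               # x is stored starting at x_1
        if get(nu, 2 * i - 2) != get(y, i - 1) + x_i:
            return False
        if get(nu, 2 * i - 1) != x_i + get(y, i):
            return False
    return True


x, y = solve_sylvester_system((9, 8, 5, 3, 1))
print(x, y)  # the worked example in the text: x = [5, 2], y = [4, 3, 1]
```

Running the sketch on $\nu = (9, 8, 5, 3, 1)$ reproduces the values $x_1 = 5$, $x_2 = 2$ and $y_0 = 4$, $y_1 = 3$, $y_2 = 1$ computed by hand.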


8.23. Glaisher’s Bijection. We define a map $H : \mathrm{DisPar} \to \mathrm{OddPar}$ as follows. Each
integer $k \ge 1$ can be written uniquely in the form $k = 2^e c$, where $e \ge 0$ and $c$ is odd.
Given $\nu \in \mathrm{DisPar}$, we replace each part $k$ in $\nu$ by $2^e$ copies of the part $c$ (where $k = 2^e c$, as
above). Sorting the resulting odd numbers into decreasing order gives us an element $H(\nu)$
in OddPar such that $|H(\nu)| = |\nu|$. For example,

\[ H((15,12,10,8,6,3,1)) = \mathrm{sort}((15,3,3,3,3,5,5,1,1,1,1,1,1,1,1,3,3,3,1)) \]
\[ = (15,5,5,3,3,3,3,3,3,3,1,1,1,1,1,1,1,1,1). \]


The inverse map $K : \mathrm{OddPar} \to \mathrm{DisPar}$ is defined as follows. Consider a partition
$\mu \in \mathrm{OddPar}$. For each odd number $c$ that appears as a part of $\mu$, let $n = n(c) \ge 1$ be the
number of times $c$ occurs in $\mu$. We can write $n$ uniquely as a sum of distinct powers of 2
(this is the base-2 expansion of the integer $n$, cf. 5.5). Say $n = 2^{d_1} + 2^{d_2} + \cdots + 2^{d_k}$. We
replace the $n$ copies of $c$ in $\mu$ by parts of size $2^{d_1}c, 2^{d_2}c, \ldots, 2^{d_k}c$. These parts are distinct
from one another (since the $d_j$’s are distinct), and they are also distinct from the parts
obtained in the same way from other odd values of $c$ appearing as parts of $\mu$. Sorting the
parts thus gives a partition $K(\mu) \in \mathrm{DisPar}$. For example,

\[ K((7,7,7,7,7,3,3,3,3,3,3,1,1,1)) = \mathrm{sort}((28,7,12,6,2,1)) = (28,12,7,6,2,1). \]

It is readily verified that $H \circ K$ and $K \circ H$ are identity maps.
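Both maps are short enough to try out directly. The sketch below is an illustration only (the names `glaisher_H` and `glaisher_K` are hypothetical, and partitions are assumed to be given as tuples of positive integers); it follows the descriptions of $H$ and $K$ above and reproduces the two worked examples.

```python
from collections import Counter


def glaisher_H(nu):
    """DisPar -> OddPar: replace each part k = 2^e * c by 2^e copies of c."""
    parts = []
    for k in nu:
        copies = 1
        while k % 2 == 0:          # strip factors of 2 from the part
            k //= 2
            copies *= 2
        parts.extend([k] * copies)
    return tuple(sorted(parts, reverse=True))


def glaisher_K(mu):
    """OddPar -> DisPar: merge the n copies of c using the binary digits of n."""
    parts = []
    for c, n in Counter(mu).items():
        power = 1
        while n > 0:
            if n % 2 == 1:         # this power of 2 appears in the expansion of n
                parts.append(power * c)
            n //= 2
            power *= 2
    return tuple(sorted(parts, reverse=True))


print(glaisher_K((7,) * 5 + (3,) * 6 + (1,) * 3))  # (28, 12, 7, 6, 2, 1)
```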


Glaisher’s bijection generalizes to prove the following theorem. 
8.24. Theorem: Glaisher’s Partition Identity. For all $d \ge 2$ and $N \ge 0$, the number
of partitions of $N$ where no part repeats $d$ or more times equals the number of partitions
of $N$ with no part divisible by $d$.


Proof. For fixed $d$, let $A$ be the set of partitions where no part repeats $d$ or more times,
and let $B$ be the set of partitions with no part divisible by $d$. It suffices to describe weight-preserving
maps $H : A \to B$ and $K : B \to A$ such that $H \circ K$ and $K \circ H$ are identity maps.
We define $K$ by analogy with what we did above. Fix $\mu \in B$. For each $c$ that appears as a
part of $\mu$, let $n = n(c)$ be the number of times this part occurs in $\mu$. Write $n$ in base $d$ as

\[ n = \sum_{k=0}^{s} a_k d^k \quad (0 \le a_k < d), \]

where $n$ and $a_0, \ldots, a_s$ all depend on $c$. To construct $K(\mu)$, we replace the $n$ copies of $c$ in
$\mu$ by $a_0$ copies of $d^0 c$, $a_1$ copies of $d^1 c$, ..., $a_k$ copies of $d^k c$, ..., and $a_s$ copies of $d^s c$. One
checks that the resulting partition lies in $A$, using the fact that no part $c$ of $\mu$ is divisible
by $d$.

To compute $H(\nu)$ for $\nu \in A$, note that each part $m$ in $\nu$ can be written uniquely in the
form $m = d^k c$ for some $k \ge 0$ and some $c = c(m)$ not divisible by $d$. Adding up all such
parts of $\nu$ that have the same value of $c$ produces an expression of the form $\sum_{k \ge 0} a_k d^k c$,
where $0 \le a_k < d$ by definition of $A$. To get $H(\nu)$, we replace all these parts by $\sum_{k \ge 0} a_k d^k$
copies of the part $c$, for every possible $c$ not divisible by $d$. Comparing the descriptions of
$H$ and $K$, one sees that these two maps are inverses. □
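The maps in this proof can be sketched in a few lines of Python. This is an illustrative sketch, not the book's own code: the names `K_d` and `H_d` are hypothetical, partitions are assumed to be tuples of positive integers, and the inputs are assumed to lie in $B$ and $A$ respectively (the functions do not validate this).

```python
from collections import Counter


def K_d(mu, d):
    """B -> A: for each part c occurring n times, write n in base d and
    replace the n copies of c by a_k copies of d^k * c for each digit a_k."""
    parts = []
    for c, n in Counter(mu).items():
        power = 1
        while n > 0:
            n, digit = divmod(n, d)            # digit is the next base-d digit a_k
            parts.extend([power * c] * digit)
            power *= d
    return tuple(sorted(parts, reverse=True))


def H_d(nu, d):
    """A -> B: replace each part m = d^k * c (c not divisible by d)
    by d^k copies of c."""
    parts = []
    for m in nu:
        copies = 1
        while m % d == 0:
            m //= d
            copies *= d
        parts.extend([m] * copies)
    return tuple(sorted(parts, reverse=True))


# Small example with d = 3: five 2's become one 6 and two 2's;
# four 1's become one 3 and one 1.
print(K_d((2, 2, 2, 2, 2, 1, 1, 1, 1), 3))  # (6, 3, 2, 2, 1)
```

Taking $d = 2$ recovers the maps of 8.23.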


8.25. Remark: Rogers-Ramanujan Identities. A huge number of partition identities
have been discovered, which are similar in character to the one we just proved. Two especially
famous examples are the Rogers-Ramanujan identities. The first such identity says
that, for all $N$, the number of partitions of $N$ into parts congruent to 1 or 4 modulo 5
equals the number of partitions of $N$ into distinct parts $\nu_1 > \nu_2 > \cdots > \nu_k > 0$ such
that $\nu_i - \nu_{i+1} \ge 2$ for all $i < k$. The second identity says that, for all $N$, the number of
partitions of $N$ into parts congruent to 2 or 3 modulo 5 equals the number of partitions of
$N$ into distinct parts $\nu_1 > \nu_2 > \cdots > \nu_k > 0 = \nu_{k+1}$ such that $\nu_i - \nu_{i+1} \ge 2$ for all $i \le k$.
One can seek algebraic and/or bijective proofs for these and other identities. Proofs of both
types are known for the Rogers-Ramanujan identities, but the bijective proofs are all quite
complicated.




8.7 Euler’s Pentagonal Number Theorem 


We have seen that $\prod_{i \ge 1}(1 + x^i)$ is the generating function for partitions with distinct
parts, whereas $\prod_{i \ge 1}(1 - x^i)^{-1}$ is the generating function for all integer partitions. This
section investigates the infinite product $\prod_{i \ge 1}(1 - x^i)$, which is the multiplicative inverse
for the partition generating function (see 7.42). The next theorem shows that expanding
this product leads to a remarkable amount of cancellation of terms due to the minus signs
(cf. 7.139).


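8.26. Pentagonal Number Theorem.

\[ \prod_{i=1}^{\infty}(1 - x^i) = 1 + \sum_{n=1}^{\infty}(-1)^n\left[x^{n(3n-1)/2} + x^{n(3n+1)/2}\right] \]
\[ = 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + x^{26} - x^{35} - x^{40} + \cdots. \]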
Proof. Consider the set DisPar of integer partitions with distinct parts, weighted by area.
For $\mu \in \mathrm{DisPar}$, define the sign of $\mu$ to be $(-1)^{\ell(\mu)}$. By modifying the argument in 8.20 to
include these signs, we obtain

\[ \prod_{i=1}^{\infty}(1 - x^i) = \sum_{\mu \in \mathrm{DisPar}} (-1)^{\ell(\mu)} x^{|\mu|}. \]

We now define an ingenious area-preserving, sign-reversing involution $I$ on DisPar (due
to Franklin). Given a partition $\mu = (\mu_1 > \mu_2 > \cdots > \mu_s) \in \mathrm{DisPar}$, let $a \ge 1$ be the largest
index such that the part sizes $\mu_1, \mu_2, \ldots, \mu_a$ are consecutive integers, and let $b = \mu_s$ be the
smallest part of $\mu$. Figure 8.4 shows how $a$ and $b$ can be read from the Ferrers diagram of $\mu$.
For most partitions $\mu$, we define $I$ as follows. If $a < b$, let $I(\mu)$ be the partition obtained by
decreasing the first $a$ parts of $\mu$ by 1 and adding a new part of size $a$ to the end of $\mu$. If $a \ge b$,
let $I(\mu)$ be the partition obtained by removing the last part of $\mu$ (of size $b$) and increasing
the first $b$ parts of $\mu$ by 1 each. See the examples in Figure 8.4. $I$ is weight-preserving and
sign-reversing, since $I(\mu)$ has either one more or one fewer part than $\mu$. It is also routine to
check that $I(I(\mu)) = \mu$. Thus we can cancel out all the pairs of objects $\{\mu, I(\mu)\}$.

It may seem at first glance that we have canceled all the objects in DisPar! However,
there are some choices of $\mu$ where the definition of $I(\mu)$ in the previous paragraph fails
to produce a partition with distinct parts. Consider what happens in the “overlapping”
situation $a = \ell(\mu)$. If $b = a + 1$ in this situation, the prescription for creating $I(\mu)$ leads to a
partition whose smallest two parts both equal $a$. On the other hand, if $b = a$, the definition
of $I(\mu)$ fails because there are not enough parts left to increment by 1 after dropping the
smallest part of $\mu$. In all other cases, the definition of $I$ works even when $a = \ell(\mu)$. We
see now that there are two classes of partitions that cannot be canceled by $I$ (Figure 8.5).
First, there are partitions of the form $(2n, 2n-1, \ldots, n+1)$, which have length $n$ and area
$n(3n+1)/2$, for all $n \ge 1$. Second, there are partitions of the form $(2n-1, 2n-2, \ldots, n)$,
which have length $n$ and area $n(3n-1)/2$, for all $n \ge 1$. Furthermore, the empty partition
is not canceled by $I$. Adding up these signed, weighted objects gives the right side of the
equation in the theorem. □
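Franklin's involution is concrete enough to experiment with. The following Python sketch is an illustration under stated assumptions, not the author's code: the names `franklin` and `distinct_partitions` are hypothetical, partitions are strictly decreasing tuples of positive integers, and the uncanceled partitions described in the proof are simply returned unchanged.

```python
def franklin(mu):
    """Apply Franklin's map I to a partition with distinct parts.

    Partitions that I cannot cancel (the empty partition and the two
    exceptional families from the proof) are returned unchanged.
    """
    s = len(mu)
    if s == 0:
        return mu
    b = mu[-1]                      # smallest part
    a = 1                           # length of the initial run of consecutive parts
    while a < s and mu[a] == mu[a - 1] - 1:
        a += 1
    if a == s and b in (a, a + 1):  # overlapping situation where I is undefined
        return mu
    if a < b:
        # shrink the first a parts and append a new smallest part of size a
        return tuple(m - 1 for m in mu[:a]) + mu[a:] + (a,)
    # a >= b: drop the smallest part and enlarge the first b remaining parts
    rest = mu[:-1]
    return tuple(m + 1 for m in rest[:b]) + rest[b:]


def distinct_partitions(n, largest=None):
    """Yield all partitions of n into distinct parts, as decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for tail in distinct_partitions(n - first, first - 1):
            yield (first,) + tail


print(franklin((7, 6, 4)))     # (6, 5, 4, 2)
print(franklin((6, 5, 4, 2)))  # (7, 6, 4)
```

Looping over `distinct_partitions(n)` for small $n$ and collecting the partitions with `franklin(mu) == mu` gives a quick check that uncanceled partitions occur only when $n$ is 0 or of the form $n = k(3k \pm 1)/2$.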


We can now deduce Euler’s recursion for counting integer partitions that we stated
in 2.48.


8.27. Theorem: Partition Recursion. For every $n \in \mathbb{Z}$, let $p(n)$ be the number of
integer partitions of $n$. The numbers $p(n)$ satisfy the recursion

\[ p(n) = p(n-1) + p(n-2) - p(n-5) - p(n-7) + p(n-12) + p(n-15) - \cdots \]
\[ = \sum_{k=1}^{\infty} (-1)^{k-1}\left[p(n - k(3k-1)/2) + p(n - k(3k+1)/2)\right] \tag{8.3} \]

for $n \ge 1$. The initial conditions are $p(0) = 1$ and $p(n) = 0$ for all $n < 0$.


Proof. We have proved the identities

\[ \prod_{i \ge 1}\frac{1}{1 - x^i} = \sum_{\mu \in \mathrm{Par}} x^{|\mu|} = \sum_{n=0}^{\infty} p(n)x^n; \]
\[ \prod_{i \ge 1}(1 - x^i) = 1 + \sum_{k \ge 1}(-1)^k\left[x^{k(3k-1)/2} + x^{k(3k+1)/2}\right]. \]

The product of the left sides of these two identities is 1, so the product of the right sides is
also 1. Thus, for each $n \ge 1$, the coefficient of $x^n$ in the product

\[ \left(\sum_{n \ge 0} p(n)x^n\right) \cdot \left(1 + \sum_{k \ge 1}(-1)^k\left[x^{k(3k-1)/2} + x^{k(3k+1)/2}\right]\right) \]

is zero. This coefficient also equals $p(n) - p(n-1) - p(n-2) + p(n-5) + p(n-7) - \cdots$.
Solving for $p(n)$ yields the recursion in the theorem. □


FIGURE 8.4
Franklin’s partition involution.


FIGURE 8.5
Fixed points of Franklin’s involution.
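Recursion (8.3) translates directly into a short program. The sketch below is a hedged illustration (the function name `partition_numbers` is hypothetical); it fills a table of $p(0), \ldots, p(N)$ and treats $p(n)$ for negative $n$ as zero by stopping each sum once the pentagonal offsets exceed $n$.

```python
def partition_numbers(limit):
    """Return [p(0), p(1), ..., p(limit)] using Euler's recursion (8.3)."""
    p = [0] * (limit + 1)
    p[0] = 1
    for n in range(1, limit + 1):
        total, k = 0, 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 == 1 else -1      # (-1)^(k-1)
            total += sign * p[n - k * (3 * k - 1) // 2]
            second = k * (3 * k + 1) // 2
            if second <= n:                     # p of a negative argument is 0
                total += sign * p[n - second]
            k += 1
        p[n] = total
    return p


print(partition_numbers(10))  # [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42]
```

Each value $p(n)$ needs only about $\sqrt{n}$ earlier values, so the whole table up to $N$ costs on the order of $N^{3/2}$ additions.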


8.8 Stirling Numbers of the First Kind 


We can often translate combinatorial recursions into generating functions for the objects
in question. We illustrate this process by developing generating functions for the Stirling
numbers of the first and second kind.

Recall from §3.6 that the signless Stirling number of the first kind (which we denote
here by $c(n,k)$) counts the number of permutations of $n$ objects whose functional digraphs
consist of $k$ disjoint cycles. These numbers satisfy $c(n,0) = 0$ for $n > 0$, $c(n,n) = 1$ for
$n \ge 0$, and

\[ c(n,k) = c(n-1,k-1) + (n-1)c(n-1,k) \quad (0 < k < n). \]

We also set $c(n,k) = 0$ whenever $k < 0$ or $k > n$.
Define a polynomial $f_n = \sum_{k=0}^{n} c(n,k)t^k \in \mathbb{Q}[t]$ for each $n \ge 0$, and define the formal
power series

\[ F = \sum_{n \ge 0}\frac{f_n}{n!}x^n = \sum_{n \ge 0}\sum_{k=0}^{n}\frac{c(n,k)}{n!}t^k x^n. \]

As we will see, it is technically convenient to introduce the denominators $n!$ as done here.
Note that the coefficient of $t^k x^n$ in $F$, namely $c(n,k)/n!$, is the probability that a randomly
chosen permutation of $n$ objects will have $k$ cycles. The coefficient of $x^0$ in $F$ is the
polynomial $1 \in \mathbb{Q}[t]$.

We will use the recursion for $c(n,k)$ to prove the relation $(1-x)D_xF = tF$, where $D_xF$
is the formal partial derivative of $F$ with respect to $x$ (§7.16). We first compute

\[ D_xF = \sum_{n \ge 1}\sum_{k=0}^{n}\frac{n\,c(n,k)}{n!}t^k x^{n-1} = \sum_{n \ge 1}\sum_{k=0}^{n}\frac{c(n,k)}{(n-1)!}t^k x^{n-1} \]
\[ = \sum_{n \ge 1}\sum_{k=0}^{n}\frac{c(n-1,k-1)}{(n-1)!}t^k x^{n-1} + \sum_{n \ge 1}\sum_{k=0}^{n}\frac{(n-1)c(n-1,k)}{(n-1)!}t^k x^{n-1}. \]

In the first summation, let $m = n-1$ and $j = k-1$. After discarding zero terms, we see
that

\[ \sum_{n \ge 1}\sum_{k=0}^{n}\frac{c(n-1,k-1)}{(n-1)!}t^k x^{n-1} = t\sum_{m \ge 0}\sum_{j=0}^{m}\frac{c(m,j)}{m!}t^j x^m = tF. \]

On the other hand, letting $m = n-1$ in the second summation shows that

\[ \sum_{n \ge 1}\sum_{k=0}^{n}\frac{(n-1)c(n-1,k)}{(n-1)!}t^k x^{n-1} = \sum_{m \ge 0}\sum_{k=0}^{m}\frac{m\,c(m,k)}{m!}t^k x^m = xD_xF, \]

since $D_x(x^m/m!) = x^{m-1}/(m-1)!$. So we indeed have $D_xF = tF + xD_xF$, as claimed.

We now know a formal differential equation satisfied by $F$, together with the initial
condition $F(0) = 1$. To find an explicit formula for $F$, we need only “solve for $F$” by
techniques that may be familiar from calculus (cf. 7.176). However, one must remember
that all our computations need to be justifiable at the level of formal power series. Let us
work in the ring $R = K[[x]]$ where $K$ is the field $\mathbb{Q}(t)$. We begin by writing the differential
equation in the form

\[ (1-x)D_xF = tF. \]

Since $1 - x \in R$ and $F \in R$ have nonzero constant terms, we can divide by these quantities
(using 7.40) to arrive at

\[ \frac{D_xF}{F} = \frac{t}{1-x}. \]

Now, $\log(F)$ is defined because $F(0) = 1$ (see 7.92), and the derivative of $\log(F)$ is $(D_xF)/F$
by 7.96. On the other hand, using 7.96 and 7.79, we see that $t/(1-x)$ is the formal derivative
of $\log[(1-x)^{-t}]$. We therefore have

\[ \frac{d}{dx}(\log(F)) = \frac{d}{dx}(\log[(1-x)^{-t}]). \]

Since both logarithms have constant term zero, we deduce

\[ \log(F) = \log[(1-x)^{-t}] \]

using 7.130 or 7.132. Finally, taking the formal exponential of both sides and using 7.94
gives

\[ F = (1-x)^{-t}. \tag{8.4} \]


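Having discovered this formula for $F$, we can now give an independent verification of its
correctness by invoking our earlier results on Stirling numbers and generalized powers:

\[ (1-x)^{-t} = \sum_{n \ge 0}\binom{-t}{n}(-x)^n \quad \text{by 7.74} \]
\[ = \sum_{n \ge 0}\frac{(t)\!\uparrow_n}{n!}x^n \quad \text{by 2.76} \]
\[ = \sum_{n \ge 0}\frac{f_n}{n!}x^n \quad \text{by 2.78} \]
\[ = F. \]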
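The middle step of this verification says that $f_n = \sum_k c(n,k)t^k$ is the rising factorial $t(t+1)\cdots(t+n-1)$. The following Python sketch checks that numerically for small $n$; it is an illustration only, and the names `cycle_numbers` and `rising_factorial_coeffs` are hypothetical.

```python
def cycle_numbers(n_max):
    """Table c[n][k] of signless Stirling numbers of the first kind,
    built from c(n,k) = c(n-1,k-1) + (n-1) c(n-1,k)."""
    c = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    c[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            c[n][k] = c[n - 1][k - 1] + (n - 1) * c[n - 1][k]
    return c


def rising_factorial_coeffs(n):
    """Coefficients (lowest degree first) of t (t+1) ... (t+n-1)."""
    poly = [1]
    for j in range(n):
        new = [0] * (len(poly) + 1)
        for i, coef in enumerate(poly):   # multiply poly by (t + j)
            new[i] += j * coef
            new[i + 1] += coef
        poly = new
    return poly


c = cycle_numbers(6)
print(c[4][:5])                    # [0, 6, 11, 6, 1]
print(rising_factorial_coeffs(4))  # [0, 6, 11, 6, 1]
```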


8.9 Stirling Numbers of the Second Kind 


In this section, we derive a generating function for Stirling numbers of the second kind.
Recall from §2.9 that $S(n,k)$ is the number of set partitions of an $n$-element set into $k$
nonempty blocks. We will study the formal power series

\[ G = \sum_{n \ge 0}\sum_{k=0}^{n}\frac{S(n,k)}{n!}t^k x^n \in \mathbb{Q}(t)[[x]]. \]

The following recursion will help us find a differential equation satisfied by $G$.
8.28. Theorem: Recursion for Stirling Numbers. For all $n \ge 0$ and $0 \le k \le n+1$,

\[ S(n+1,k) = \sum_{i=0}^{n}\binom{n}{i}S(n-i,k-1). \]

The initial conditions are $S(0,0) = 1$ and $S(n,k) = 0$ whenever $k < 0$ or $k > n$.


Proof. Consider set partitions of $\{1,2,\ldots,n+1\}$ into $k$ blocks such that the block containing
$n+1$ has $i$ other elements in it (where $0 \le i \le n$). To build such a set partition, choose
the $i$ elements that go in the block with $n+1$ in $\binom{n}{i}$ ways, and then choose a set partition
of the remaining $n-i$ elements into $k-1$ blocks. The recursion now follows from the sum
and product rules. (Compare to the proof of 2.53.) □


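8.29. Theorem: Differential Equation for G. The series $G = \sum_{n \ge 0}\sum_{k=0}^{n}\frac{S(n,k)}{n!}t^k x^n$
satisfies $G(0) = 1$ and

\[ D_xG = te^xG. \]


Proof. The derivative of $G$ with respect to $x$ is

\[ \sum_{m \ge 1}\sum_{k=0}^{m}\frac{m\,S(m,k)}{m!}t^k x^{m-1} = \sum_{n \ge 0}\sum_{k=0}^{n+1}\frac{S(n+1,k)}{n!}t^k x^n, \]

where we have set $n = m-1$. Using 8.28 transforms this expression into

\[ \sum_{n \ge 0}\sum_{k=0}^{n+1}\sum_{i=0}^{n}\frac{1}{n!}\binom{n}{i}S(n-i,k-1)t^k x^n = \sum_{n \ge 0}\sum_{k=0}^{n+1}\sum_{i=0}^{n}\frac{S(n-i,k-1)}{i!\,(n-i)!}t^k x^n. \]

Setting $j = k-1$, the formula becomes

\[ t\sum_{n \ge 0}\sum_{j=0}^{n}\sum_{i=0}^{n}\frac{1}{i!}\cdot\frac{S(n-i,j)}{(n-i)!}t^j x^n = t\sum_{n \ge 0}\sum_{i=0}^{n}\frac{x^i}{i!}\left[\sum_{j=0}^{n-i}\frac{S(n-i,j)}{(n-i)!}t^j x^{n-i}\right]. \]

Finally, recalling 7.6, the last expression equals

\[ t\left(\sum_{i \ge 0}\frac{x^i}{i!}\right)\left(\sum_{m \ge 0}\sum_{j=0}^{m}\frac{S(m,j)}{m!}t^j x^m\right) = te^xG. \qquad \square \]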


8.30. Theorem: Generating Function for Stirling Numbers of the Second Kind.

\[ \sum_{n \ge 0}\sum_{k=0}^{n}\frac{S(n,k)}{n!}t^k x^n = \exp[t(e^x - 1)] \in \mathbb{Q}(t)[[x]]. \]


Proof. Call the left side $G$, as above. We proceed to solve the differential equation
$D_xG = te^xG$ and initial condition $G(0) = 1$. The series $G$ is invertible, having nonzero
constant term, so we have $(D_xG)/G = te^x$. Formally integrating both sides with respect to
$x$ (cf. 7.130) leads to $\log(G) = te^x + c$, where $c \in \mathbb{Q}(t)$. As $\log(G)_0 = 0$ and $(te^x)_0 = t$, the
constant of integration must be $c = -t$. Finally, exponentiating both sides shows that

\[ G = \exp(te^x - t) = \exp[t(e^x - 1)]. \qquad \square \]
8.31. Theorem: Generating Function for S(n,k) for fixed k. For all $k \ge 0$,

\[ \sum_{n \ge k} S(n,k)\frac{x^n}{n!} = \frac{1}{k!}(e^x - 1)^k \in \mathbb{Q}[[x]]. \]


Proof. We have

\[ \exp[t(e^x - 1)] = \sum_{k \ge 0} t^k\frac{(e^x - 1)^k}{k!} \in \mathbb{Q}(t)[[x]]. \]

Extracting the coefficient of $t^k$ and using 8.30, we obtain the desired formula. □
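As a numerical cross-check of 8.28 and 8.31, one can build $S(n,k)$ from the recursion and compare it with the coefficients of $(e^x - 1)^k/k!$ computed as a truncated power series with exact rational arithmetic. This is a minimal sketch, with hypothetical names `stirling2_table` and `egf_coeffs`.

```python
from fractions import Fraction
from math import comb, factorial


def stirling2_table(n_max):
    """Table S[n][k] built from S(n+1,k) = sum_i C(n,i) S(n-i,k-1)."""
    S = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    S[0][0] = 1
    for n in range(n_max):
        for k in range(1, n + 2):
            S[n + 1][k] = sum(comb(n, i) * S[n - i][k - 1] for i in range(n + 1))
    return S


def egf_coeffs(k, n_max):
    """Coefficients of x^0 .. x^n_max in (e^x - 1)^k / k!, as exact fractions."""
    base = [Fraction(0)] + [Fraction(1, factorial(j)) for j in range(1, n_max + 1)]
    power = [Fraction(1)] + [Fraction(0)] * n_max      # the series 1
    for _ in range(k):                                  # multiply by (e^x - 1), truncated
        power = [sum(power[i] * base[n - i] for i in range(n + 1))
                 for n in range(n_max + 1)]
    return [coef / factorial(k) for coef in power]


S = stirling2_table(6)
coeffs = egf_coeffs(2, 6)
print([int(coeffs[n] * factorial(n)) for n in range(7)])  # matches column k = 2 of S
print([S[n][2] for n in range(7)])                        # [0, 0, 1, 3, 7, 15, 31]
```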


8.10 The Exponential Formula 


Many combinatorial structures can be decomposed into disjoint unions of “connected com- 
ponents.” For example, set partitions consist of a collection of disjoint blocks; permutations 
can be regarded as a collection of disjoint cycles; and graphs are disjoint unions of con- 
nected graphs. The exponential formula allows us to compute the generating function for 
such structures from the generating function for the connected “building blocks” of which 
they are composed. 

For each $k \ge 1$, let $\mathcal{C}_k$ be an admissible weighted set of “connected structures of size $k$,”
and let $C_k = \sum_{z \in \mathcal{C}_k}\mathrm{wt}(z) \in \mathbb{Q}[[t]]$. We introduce the generating function $C^* = \sum_{k \ge 1}\frac{C_k}{k!}x^k$
to encode information about all the sets $\mathcal{C}_k$ (cf. 7.168). Next, we must formally define a
set of structures of size $n$ that consist of disjoint unions of connected structures of various
sizes summing to $n$. For every $n \ge 0$, let $U_n$ be the set of pairs $(S, f)$ such that: $S$ is a
set partition of $\{1,2,\ldots,n\}$, and $f : S \to \bigcup_{k \ge 1}\mathcal{C}_k$ is a function such that $f(A) \in \mathcal{C}_{|A|}$ for
all $A \in S$. This says that for every $m$-element block $A$ of the set partition $S$, $f(A)$ is a
connected structure of size $m$. Let $\mathrm{wt}(S, f) = \prod_{A \in S}\mathrm{wt}(f(A))$, and define

\[ F = \sum_{n \ge 0}\left(\sum_{u \in U_n}\mathrm{wt}(u)\right)\frac{x^n}{n!}. \]

Note that $F(0) = 1$.
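8.32. Theorem: Exponential Formula. With notation as above, $F = \exp(C^*)$.

Proof. By 7.92,

\[ \exp(C^*) = \sum_{m \ge 0}\frac{1}{m!}\left(\sum_{k \ge 1}\frac{C_k}{k!}x^k\right)^m. \]

This series has constant term 1. For $n > 0$, the coefficient of $x^n$ in $\exp(C^*)$ is (by 7.10)

\[ \sum_{m=1}^{n}\frac{1}{m!}\sum_{k_1=1}^{n}\sum_{k_2=1}^{n}\cdots\sum_{k_m=1}^{n}\frac{C_{k_1}}{k_1!}\frac{C_{k_2}}{k_2!}\cdots\frac{C_{k_m}}{k_m!}\,\chi(k_1 + k_2 + \cdots + k_m = n). \]

This coefficient can also be written

\[ \frac{1}{n!}\sum_{m=1}^{n}\frac{1}{m!}\sum_{\substack{(k_1,\ldots,k_m):\\ k_i > 0,\ k_1+\cdots+k_m=n}}\binom{n}{k_1,\ldots,k_m}C_{k_1}C_{k_2}\cdots C_{k_m}. \]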
Comparing to the definition of $F$, we need to prove that

\[ \sum_{u \in U_n}\mathrm{wt}(u) = \sum_{m=1}^{n}\frac{1}{m!}\sum_{\substack{(k_1,\ldots,k_m):\\ k_i > 0,\ k_1+\cdots+k_m=n}}\binom{n}{k_1,\ldots,k_m}C_{k_1}C_{k_2}\cdots C_{k_m}. \]

Consider objects $(S, f) \in U_n$ for which $|S| = m$ (so that the object has $m$ “connected
components”). By the sum rule, it will suffice to prove

\[ m!\sum_{\substack{(S,f) \in U_n:\\ |S|=m}}\mathrm{wt}(S,f) = \sum_{\substack{(k_1,\ldots,k_m):\\ k_i > 0,\ k_1+\cdots+k_m=n}}\binom{n}{k_1,\ldots,k_m}C_{k_1}C_{k_2}\cdots C_{k_m} \quad (1 \le m \le n). \]

The left side counts pairs $(T, f)$, where $T = (T_1, T_2, \ldots, T_m)$ is an ordered set partition of
$\{1,2,\ldots,n\}$ into $m$ blocks, and for each $i \le m$, $f(T_i) \in \mathcal{C}_{|T_i|}$. (We must multiply by $m!$
to pass from the set partition $\{T_1,\ldots,T_m\}$ to the ordered set partition $(T_1,\ldots,T_m)$.) The
right side counts the same set of objects, as we see by the following counting argument.
Let $(k_1,\ldots,k_m)$ be the sizes of the blocks in the ordered list $T = (T_1,\ldots,T_m)$. We can
identify $T$ with a word $w = w_1w_2\cdots w_n$ in $R(1^{k_1}\cdots m^{k_m})$ by letting $w_i = j$ iff $i \in T_j$. It
follows that there are $\binom{n}{k_1,\ldots,k_m}$ choices for the ordered set partition $T$. Next,
for $1 \le i \le m$, we choose $f(T_i) \in \mathcal{C}_{k_i}$. The generating function for this choice is $C_{k_i}$. The
formula on the right side now follows from the product and sum rules. □


The generating functions for Stirling numbers (derived in §8.8 and §8.9) are consequences 
of the exponential formula, as we now show. 


8.33. Example: Bell Numbers and Stirling Numbers of the Second Kind. For
each $k \ge 1$, let $\mathcal{C}_k$ consist of a single element of weight 1. Then $C^* = \sum_{k \ge 1} 1x^k/k! = e^x - 1$.
With this choice of the sets $\mathcal{C}_k$, an element $(S, f) \in U_n$ can be identified with the set
partition $S$ of an $n$-element set, since there is only one possible choice for the function $f$.
Therefore, in this example,

\[ F = \sum_{n \ge 0}\left(\sum_{u \in U_n}\mathrm{wt}(u)\right)\frac{x^n}{n!} = \sum_{n \ge 0}\frac{B(n)}{n!}x^n, \]

where the Bell number $B(n)$ counts set partitions (see 2.51). The exponential formula now
gives

\[ \sum_{n \ge 0}\frac{B(n)}{n!}x^n = \exp(e^x - 1). \]

Intuitively, the unique element in $\mathcal{C}_k$ is the $k$-element set $\{1,2,\ldots,k\}$ which is the prototypical
example of a $k$-element block in a set partition. If we let this element have weight $t$
(for all $k$), then $\mathrm{wt}(S, f) = t^{|S|}$ will encode the number of blocks in the set partition $S$. In
this case, $C^* = t(e^x - 1)$ and the exponential formula gives

\[ \sum_{n \ge 0}\sum_{k=0}^{n}\frac{S(n,k)}{n!}t^k x^n = \exp[t(e^x - 1)], \]

in agreement with 8.30.
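The identity $\sum_{n \ge 0} B(n)x^n/n! = \exp(e^x - 1)$ can be used to compute Bell numbers. The sketch below is an illustration, not the book's method: it exponentiates a truncated series using the standard relation $F' = (C^*)'F$, which gives $n f_n = \sum_{k=1}^{n} k\,c_k f_{n-k}$ for the ordinary coefficients of $F = \exp(C^*)$. The names `series_exp` and `bell_numbers` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def series_exp(c, n_max):
    """Ordinary coefficients f_0..f_n_max of exp(C), where C = sum_k c[k] x^k
    and c[0] == 0, via n * f_n = sum_{k=1}^{n} k * c_k * f_{n-k}."""
    f = [Fraction(1)] + [Fraction(0)] * n_max
    for n in range(1, n_max + 1):
        f[n] = sum(k * c[k] * f[n - k] for k in range(1, n + 1)) / n
    return f


def bell_numbers(n_max):
    """Bell numbers B(0)..B(n_max) read off from exp(e^x - 1)."""
    c = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, n_max + 1)]
    f = series_exp(c, n_max)
    return [int(f[n] * factorial(n)) for n in range(n_max + 1)]


print(bell_numbers(6))  # [1, 1, 2, 5, 15, 52, 203]
```

The same `series_exp` helper applies to any choice of $C^*$ with zero constant term, which is the point of the exponential formula.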


8.34. Example: Stirling Numbers of the First Kind. For each $k \ge 1$, let $\mathcal{C}_k$ consist
of all $k$-cycles on $\{1,2,\ldots,k\}$, each having weight $t$. Since there are $(k-1)!$ such $k$-cycles,
we have

\[ C^* = \sum_{k \ge 1}\frac{t(k-1)!}{k!}x^k = \sum_{k \ge 1}\frac{tx^k}{k} = -t\log(1-x) = \log[(1-x)^{-t}] \]

(the last step used 7.97). Consider an element $(S, f) \in U_n$. If $A = \{i_1 < i_2 < \cdots < i_k\}$ is
a block of $S$, then $f(A)$ is some $k$-cycle $(j_1, j_2, \ldots, j_k)$, where $j_1, \ldots, j_k$ is a rearrangement
of $1, 2, \ldots, k$. By replacing the numbers $1, 2, \ldots, k$ in this $k$-cycle by $i_1, i_2, \ldots, i_k$, we obtain
a $k$-cycle with vertex set $A$. Doing this for every $A \in S$ produces the functional digraph of
a permutation of $n$ elements. More formally, we have just defined a bijection from $U_n$ to
the set $S_n$ of permutations of $\{1,2,\ldots,n\}$. Note that $\mathrm{wt}(S, f) = t^{|S|} = t^c$, where $c$ is the
number of cycles in the permutation associated to $(S, f)$. It follows from these observations
and the exponential formula that

\[ \sum_{n \ge 0}\sum_{k=0}^{n}\frac{c(n,k)}{n!}t^k x^n = \sum_{n \ge 0}\sum_{w \in S_n}\frac{t^{\mathrm{cyc}(w)}}{n!}x^n = \exp(C^*) = (1-x)^{-t}, \]

where $\mathrm{cyc}(w)$ denotes the number of cycles of $w$, in agreement with (8.4).


In the next example, we use the inverse of the exponential formula to deduce the generating
function $C^*$ from knowledge of the generating function $F$.


8.35. Example: Connected Components of Graphs. For each $k \ge 1$, let $\mathcal{C}_k$ consist of
all connected simple graphs on the vertex set $\{1,2,\ldots,k\}$; let each such graph have weight
1. Direct computation of the generating function $C^*$ is difficult. On the other hand, consider
an object $(S, f) \in U_n$. Given a block $A = \{i_1 < i_2 < \cdots < i_k\}$ in $S$, $f(A)$ is a connected
graph with vertex set $\{1,2,\ldots,k\}$. Renaming these vertices to be $i_1, i_2, \ldots, i_k$ produces a
connected graph with vertex set $A$. Doing this for every $A \in S$ produces an arbitrary simple
graph with vertex set $\{1,2,\ldots,n\}$. Thus, $F$ is the generating function for such graphs, with
$x$ keeping track of the number of vertices. There are $2^{\binom{n}{2}}$ simple graphs on $n$ vertices, since
we can either include or exclude each of the $\binom{n}{2}$ possible edges. Accordingly,

\[ F = \sum_{n \ge 0}\frac{2^{\binom{n}{2}}}{n!}x^n, \]

and so $C^* = \log(F)$. Extracting the coefficient of $x^n/n!$ on both sides leads to the exact
formula

\[ \sum_{m=1}^{n}\frac{(-1)^{m-1}}{m}\sum_{\substack{(k_1,\ldots,k_m):\\ k_i > 0,\ k_1+\cdots+k_m=n}}\binom{n}{k_1,\ldots,k_m}2^{\binom{k_1}{2}+\cdots+\binom{k_m}{2}} \tag{8.5} \]

for the number of connected simple graphs on $n$ vertices.
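A formula of the shape of (8.5), namely the coefficient of $x^n/n!$ in $\log(F) = \sum_{m \ge 1}\frac{(-1)^{m-1}}{m}(F-1)^m$ with $F - 1 = \sum_{k \ge 1} 2^{\binom{k}{2}}x^k/k!$, can be evaluated by brute force for small $n$. The Python sketch below is an illustration under that reading of the formula; the names `compositions` and `connected_graph_count` are hypothetical. It loops over all compositions of $n$, so it is only practical for small $n$.

```python
from fractions import Fraction
from math import comb, factorial


def compositions(n):
    """Yield every tuple of positive integers summing to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def connected_graph_count(n):
    """Sum over compositions (k_1,...,k_m) of n of
    (-1)^(m-1)/m * multinomial(n; k_1..k_m) * 2^(C(k_1,2)+...+C(k_m,2))."""
    total = Fraction(0)
    for ks in compositions(n):
        m = len(ks)
        multinomial = factorial(n)
        for k in ks:
            multinomial //= factorial(k)   # each division is exact
        weight = 2 ** sum(comb(k, 2) for k in ks)
        total += Fraction((-1) ** (m - 1), m) * multinomial * weight
    assert total.denominator == 1          # the alternating sum is an integer
    return int(total)


print(connected_graph_count(3))  # 4: the path on 3 labeled vertices (3 ways) and the triangle
```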


Summary 


• Admissible Sets and Generating Functions. A weighted set $(T, \mathrm{wt})$ is admissible iff
$T_n = \{z \in T : \mathrm{wt}(z) = n\}$ is finite for all $n \ge 0$. The generating function for an
admissible weighted set is $G_{T,\mathrm{wt}} = \sum_{n=0}^{\infty}|T_n|x^n \in \mathbb{Q}[[x]]$. Two weighted sets have the
same generating function iff there is a weight-preserving bijection between them.

• Sum Rule for Generating Functions. If an admissible weighted set $S$ is a finite or countable
disjoint union of subsets $T_i$, then each $T_i$ is admissible and $G_S = \sum_i G_{T_i}$.

• Finite Product Rule for Generating Functions. If $T = T_1 \times \cdots \times T_k$ is a finite product
of admissible weighted sets, with $\mathrm{wt}(z_1,\ldots,z_k) = \sum_i \mathrm{wt}(z_i)$, then $T$ is admissible and
$G_T = G_{T_1}\cdots G_{T_k}$.

• Infinite Product Rule for Generating Functions. Suppose $\{T_n : n \ge 1\}$ is a family
of admissible weighted sets where each $T_n$ has a unique element $1_n$ of weight zero and
$\mathrm{ord}(G_{T_n} - 1) \to \infty$ as $n \to \infty$. Let $T = \prod_{n \ge 1} T_n$ be the set of sequences $z = (z_n : n \ge 1)$
with $z_n \in T_n$ and $z_n = 1_n$ for all large enough $n$, with $\mathrm{wt}(z) = \sum_{n \ge 1}\mathrm{wt}(z_n)$. Then $T$ is
admissible and $G_T = \prod_{n=1}^{\infty} G_{T_n}$.


• Generating Functions for Trees. The formal series

\[ \frac{1 - \sqrt{1-4x}}{2} = \sum_{n \ge 1} C_{n-1}x^n = \sum_{n \ge 1}\frac{1}{2n-1}\binom{2n-1}{n-1,\,n}x^n \]

is the generating function for the following sets of weighted trees: (a) binary trees,
weighted by number of vertices plus 1; (b) nonempty full binary trees, weighted by
number of leaves; (c) ordered trees, weighted by number of vertices.


• Compositional Inversion Formulas. Given $F = x/R$ where $R \in K[[x]]$ and $R(0) \ne 0$,
the unique formal series $G$ such that $F \bullet G = x = G \bullet F$ is the generating function for
the set of terms, where the weight of a term $w_1\cdots w_n$ is $x^n R_{w_1}\cdots R_{w_n}$. Furthermore,
$G(n) = (R^n)_{n-1}/n = [(d/dx)^{n-1}R^n]_0/n!$ for $n \ge 1$.


• Partition Generating Functions. By building integer partitions row by row or column
by column and using the product rule, one sees that

\[ \sum_{\mu \in \mathrm{Par}} t^{\ell(\mu)}x^{|\mu|} = \prod_{i=1}^{\infty}\frac{1}{1 - tx^i} = \sum_{\mu \in \mathrm{Par}} t^{\mu_1}x^{|\mu|}; \]
\[ \sum_{\mu \in \mathrm{OddPar}} x^{|\mu|} = \prod_{k=1}^{\infty}\frac{1}{1 - x^{2k-1}} = \prod_{i=1}^{\infty}(1 + x^i) = \sum_{\mu \in \mathrm{DisPar}} x^{|\mu|}. \]

Sylvester’s bijection dissects the centered Ferrers diagram of $\mu \in \mathrm{OddPar}$ into L-shaped
pieces that give a partition in DisPar. Glaisher’s bijection replaces each part $k = 2^e c$ in
a partition $\nu \in \mathrm{DisPar}$ (where $e \ge 0$ and $c$ is odd) by $2^e$ copies of $c$, giving a partition
in OddPar.


• Pentagonal Number Theorem. Franklin proved

\[ \prod_{i=1}^{\infty}(1 - x^i) = 1 + \sum_{n=1}^{\infty}(-1)^n\left[x^{n(3n-1)/2} + x^{n(3n+1)/2}\right] \]

by an involution on signed partitions with distinct parts. The map moves boxes between
the “staircase” at the top of the partition and the bottom row; this move cancels all
partitions except the “pentagonal” ones counted by the right side. Since $\prod_{i \ge 1}(1 - x^i)$
is the inverse of the partition generating function, we deduce the partition recursion

\[ p(n) = \sum_{k \ge 1}(-1)^{k-1}\left[p(n - k(3k-1)/2) + p(n - k(3k+1)/2)\right]. \]
• Generating Functions for Stirling Numbers. By solving formal differential equations or
using the exponential formula, one obtains

\[ \sum_{n \ge 0}\sum_{k=0}^{n}\frac{c(n,k)}{n!}t^k x^n = (1-x)^{-t}; \qquad \sum_{n \ge 0}\sum_{k=0}^{n}\frac{S(n,k)}{n!}t^k x^n = \exp[t(e^x - 1)]. \]

Hence, $\sum_{n \ge k} S(n,k)x^n/n! = (e^x - 1)^k/k!$ for $k \ge 0$.


• Exponential Formula. Suppose $T_k$ is a weighted set with $G_{T_k} \in \mathbb{Q}[[t]]$, for $k > 0$. Let
$C^* = \sum_{k \ge 1}(x^k/k!)G_{T_k} \in \mathbb{Q}[[t,x]]$, and let $U_n$ be the set of pairs $(S, f)$ where $S$ is a
set partition of $\{1,2,\ldots,n\}$ and $f$ is a function on $S$ with $f(A) \in T_{|A|}$ for all $A \in S$.
Let $\mathrm{wt}(S, f) = \prod_{A \in S}\mathrm{wt}(f(A))$ and $F = \sum_{n \ge 0}\sum_{(S,f) \in U_n}\mathrm{wt}(S, f)x^n/n!$. Then $F =
\exp(C^*)$. Informally, if $C^*$ is the exponential generating function for a set of connected
building blocks, then $F = \exp(C^*)$ is the exponential generating function for the set of
objects obtained by taking labeled disjoint unions of these building blocks.


Exercises 


8.36. (a) Let $W$ be the set of all words over a $k$-letter alphabet, weighted by length. Find
the generating function $G_W$. (b) Show that $x(G_W')$ is the generating function for the set of
all nonempty words in which one letter has been underlined.


8.37. Let $S$ be the set of $k$-element subsets of $\mathbb{N}$, weighted by the largest element in $S$.
Find $G_S$. What happens if we weight a subset by its smallest element?


8.38. Fix $k \in \mathbb{N}^+$. Use the sum and product rules for weighted sets to find the generating
function for the set of all compositions with $k$ parts, weighted by the sum of the parts.


8.39. Compute the images of these partitions under Sylvester’s bijection (see 8.22): 
(a). (16; 57;,3/.1°)5 (b) (7°, 3° 1°); (e).11, 77,5, 3): (d). (0%); 6) (n= 12 3,.2..45,3,1); 
(The notation 5? means two parts equal to 5, etc.) 


8.40. Compute the images of these partitions under the inverse of Sylvester’s bijection: (a) 
(15, 12, 10, 8,6,3,1); (b) (28,12, 7,6,2,1); (c) (11,7,5,3); (d) (21,17, 16, 13, 11,9, 8, 3,2, 1); 
(e) (n,n —1,...,3,2,1). 


8.41. Compute the images of these partitions under Glaisher’s bijection (see 8.23): (a) 
(9, 8,5, 3, 1); (b) (28, 12,7, 6,2, 1); (c) (11,7,5,3); (d) (21, 17, 16, 13, 11, 9, 8, 3, 2, 1). 


8.42. Compute the images of these partitions under the inverse of Glaisher’s bijection: (a) 
(15, 57,37, 1); (b) (187, 119, 7*, 5, 3?, 1) (c) (11,7, 5,3); (d) (9°); (e) (1”). 


8.43. Which partitions map to themselves under Glaisher’s bijection? What about the 
generalized bijection in 8.24? 


8.44. Let H and K be the maps in the proof of 8.24. (a) Find H(25, 17, 17, 10, 9,6, 6, 5, 2, 2), 
for d = 3,4,5. (b) Find K(8!°, 77, 229, 13°), for d =3,5,6. 


8.45. Calculate the image of each partition under Franklin’s involution (§8.7): 
(a) (17, 16, 15, 14, 13, 10, 8, 7, 4); (b) (17, 16, 15, 14, 13, 10, 8); (c) (n,n—1,...,3,2, 1); (d) (n). 
8.46. Find the generating function for the set of all integer partitions that satisfy each
restriction below: (a) all parts are divisible by 3; (b) all parts are distinct and even; (c) odd
parts appear at most twice; (d) each part is congruent to 1 or 4 mod 7; (e) for each $i > 0$,
there are at most $i$ parts of size $i$.


8.47. Give combinatorial interpretations for the coefficients in the following formal power 
series: (a) []j4(1—2)~1; (b) Tso(l t2%4)(1 +2); (c) [Tjso(1—2 +a%)/(1—2'). 


8.48. (a) Show that the first Rogers-Ramanujan identity (see 8.25) can be written

\[ \prod_{n=1}^{\infty}\frac{1}{(1 - x^{5n-4})(1 - x^{5n-1})} = 1 + \sum_{k=1}^{\infty}\frac{x^{k^2}}{(1-x)(1-x^2)\cdots(1-x^k)}. \]

(b) Find a similar formulation
of the second Rogers-Ramanujan identity. (c) Verify the Rogers-Ramanujan identities for
partitions of $N = 12$ by explicitly listing all the partitions satisfying the relevant restrictions.


8.49. Give a detailed verification of the claim in 8.11 that the quadratic equation $G^2 - G +
x = 0$ has the unique solution $G = (1 - \sqrt{1-4x})/2$ in $\mathbb{Q}[[x]]$.


8.50. Prove that a nonempty full binary tree with a leaves has a — 1 non-leaf vertices. 


8.51. Let f be the bijection in Figure 8.1. Compute f(T’), where T is the binary tree in 
Figure 2.12. 


8.52. Let g be the bijection shown in Figure 8.2. Verify that the number of vertices in g(t) 
equals the number of leaves in t, for each full binary tree t. 


8.53. (a) Describe the inverse of the bijection $g$ shown in Figure 8.2. (b) Calculate the
image of the ordered tree in Figure 3.18 under $g^{-1}$.


8.54. List all full binary trees with 5 leaves, and compute the image of each tree under the 
map g in Figure 8.2. 


8.55. Verify that $(R^n)_{n-1}/n = ((d/dx)^{n-1}R^n)_0/n!$ for all $n \ge 1$ and $R \in K[[x]]$.


8.56. Give an algebraic proof of 8.24 using formal power series.


8.57. (a) Carefully verify that the maps H and K in 8.23 are two-sided inverses. (b) Repeat 
part (a) for the maps H and K in 8.24. 


8.58. (a) Verify that the partition (2n, 2n—1,...,n+1) (one of the fixed points of Franklin’s 
involution) has area n(3n + 1)/2. (b) Verify that the partition (2n — 1,2n — 2,...,n) has 
area n(3n — 1)/2. 


8.59. Carry out the computations showing how the equation $C^* = \log(F)$ leads to
formula (8.5).


8.60. Rewrite (8.5) as a sum over partitions of n. 


8.61. Use (8.5) to compute the number of connected simple graphs with vertex set: (a) 
{1, 2,3, 4}; (b) {1,2,3,4, 5}. 


8.62. (a) Modify (8.5) to include a power of t that keeps track of the number of edges in 
the connected graph. (b) How many connected simple graphs with vertex set {1, 2,3, 4,5, 6} 
have exactly seven edges? 


8.63. (a) Find the generating function for the set of all Dyck paths, where the weight of a 
path ending at (n,n) is n. (b) A marked Dyck path is a Dyck path in which one step (north 
or east) has been circled. Find the generating function for marked Dyck paths. 
8.64. Recall that $\sum_{n \ge 0}\sum_{k=0}^{n}\frac{S(n,k)}{n!}t^k x^n = \exp[t(e^x - 1)]$. Use partial differentiation of this
generating function to find generating functions for: (a) the set of set partitions, where one
block in the partition has been circled; (b) the set of set partitions, where one element of
one block in the partition has been circled.


8.65. (a) List all terms of length at most 5. (b) Use (a) and 8.14 to write down explicit
formulas for the first five coefficients of the compositional inverse of $x/R$ as combinations
of the coefficients of $R$. (c) Use (b) to find the first five terms in the compositional inverse
of x/(1 — 3a + 2x? + 5a‘), 


8.66. Use 8.15 to compute the compositional inverse of the following formal series: (a) xe?”; 
(b) x — x; (c) a/(1 +z); (d) x — 4x4 + 427. 


8.67. Let $S$ be the set of paths that start at $(0,0)$ and take horizontal steps (right 1, up
0), vertical steps (right 0, up 1), and diagonal steps (right 1, up 1). By considering the final
step of a path, find an equation satisfied by $G_S$ and solve for $G_S$, taking the weight of a
path ending at $(c,d)$ to be: (a) the number of steps in the path; (b) $c + d$; (c) $c$.


8.68. For fixed $k \ge 1$, find the generating function for integer partitions with: (a) $k$ nonzero
parts; (b) $k$ nonzero distinct parts. (c) Deduce summation formulas for the infinite products
$\prod_{i \ge 1}(1 - x^i)^{-1}$ and $\prod_{i \ge 1}(1 + x^i)$.


8.69. A ternary tree is either $\emptyset$ or a 4-tuple $(\bullet, t_1, t_2, t_3)$, where each $t_i$ is itself a ternary
tree. Find an equation satisfied by the generating function for ternary trees, weighted by
number of vertices.


8.70. Let $S$ be the set of ordered trees where every node has at most two children, weighted
by the number of vertices. (a) Use the sum and product rules to find an equation satisfied
by $G_S$. (b) Solve this equation for $G_S$. (c) How many trees in $S$ have 7 vertices?


8.71. Find a formula for the number of simple digraphs with n vertices such that the graph 
obtained by erasing loops and ignoring the directions on the edges is connected. 


8.72. Prove that the number of integer partitions of N in which no even part appears more 
than once equals the number of partitions of N in which no part appears 4 or more times. 


8.73. Prove that the number of integer partitions of N that have no part equal to 1 and 
no parts that differ by 1 equals the number of partitions of N in which no part appears 
exactly once. 


8.74. (a) Write down an infinite product that is the generating function for integer par- 
titions with odd, distinct parts. (b) Show that the generating function for self-conjugate 
partitions (i.e., partitions such that $\lambda' = \lambda$) is $1 + \sum_{k \ge 1} x^{k^2} / ((1 - x^2)(1 - x^4) \cdots (1 - x^{2k}))$.
(c) Find an area-preserving bijection between the sets of partitions in (a) and (b), and 
deduce an associated formal power series identity. 


8.75. Evaluate $\sum_{k \ge 0} x^k (1 - x)^{-k}$.

8.76. How many integer partitions of $n$ have the form $(i^j (i + 1)^k)$ for some $i, j, k > 0$?


8.77. Dobinski's Formula. Prove that the Bell numbers (see 2.51) satisfy
$B(n) = e^{-1} \sum_{k \ge 0} k^n / k!$ for $n \ge 0$.


8.78. Show that, for all $N \ge 0$, $(-1)^N |\mathrm{OddPar} \cap \mathrm{DisPar} \cap \mathrm{Par}(N)|$ equals
$|\{\mu \in \mathrm{Par}(N) : \ell(\mu) \text{ is even}\}| - |\{\mu \in \mathrm{Par}(N) : \ell(\mu) \text{ is odd}\}|$.


8.79. (a) Use an involution on the set $\mathrm{Par} \times \mathrm{DisPar}$ to give a combinatorial proof of the
identity $\prod_{n \ge 1} (1 - x^n)^{-1} \prod_{n \ge 1} (1 - x^n) = 1$. (b) More generally, for $S \subseteq \mathbb{N}^+$, prove
combinatorially that $\prod_{n \in S} (1 - x^n)^{-1} \prod_{n \in S} (1 - x^n) = 1$.


8.80. (a) Find a bijection from the set of terms of length n to the set of binary trees with 
n —1 nodes. (b) Use (a) to formulate a version of 8.14 that expresses the coefficients in the 
compositional inverse of $x/R$ as sums of suitably weighted binary trees.


8.81. (a) Find a bijection from the set of terms of length n to the set of Dyck paths ending 
at (n—1,n—1). (b) Use (a) to formulate a version of 8.14 that expresses the coefficients in 
the compositional inverse of $x/R$ as sums of suitably weighted Dyck paths.


8.82. Let $d(n, k)$ be the number of derangements in $S_n$ with $k$ cycles. Find a formula for
\[ \sum_{n \ge 0} \sum_{k=0}^{n} \frac{d(n, k)}{n!} t^k x^n. \]

8.83. Compute $\sum_{n \ge 0} d_n x^n / n!$, where $d_n$ is the number of derangements of $n$ objects.


8.84. Let $S_1(n, k)$ be the number of set partitions of $\{1, 2, \ldots, n\}$ into $k$ blocks where no
block consists of a single element. Find a formula for $\sum_{n \ge 0} \sum_{k=0}^{n} \frac{S_1(n, k)}{n!} t^k x^n$.


8.85. What is the generating function for set partitions in which all block sizes must belong 
to a given subset $T \subseteq \mathbb{N}^+$?


8.86. Let A(n,k) be the number of ways to assign n people to k committees in such a way 
that each person belongs to exactly one committee, and each committee has one member 
designated as chairman. Find a formula for $\sum_{n \ge 0} \sum_{k=0}^{n} \frac{A(n, k)}{n!} t^k x^n$.


8.87. Involution Proof of Euler's Partition Recursion. (a) For fixed $n \ge 1$, prove
$\sum_{j \in \mathbb{Z}} (-1)^j p(n - (3j^2 + j)/2) = 0$ by verifying that the following map $I$ is a sign-reversing
involution with no fixed points. The domain of $I$ is the set of pairs $(j, \lambda)$ with $j \in \mathbb{Z}$ and
$\lambda \in \mathrm{Par}(n - (3j^2 + j)/2)$. To define $I(j, \lambda)$, consider two cases. If $\ell(\lambda) + 3j \ge \lambda_1$, set
$I(j, \lambda) = (j - 1, \mu)$ where $\mu$ is formed by preceding the first part of $\lambda$ by $\ell(\lambda) + 3j$ and
then decrementing all parts by 1. If $\ell(\lambda) + 3j < \lambda_1$, set $I(j, \lambda) = (j + 1, \nu)$ where $\nu$ is
formed by deleting the first part of $\lambda$, incrementing the remaining nonzero parts of $\lambda$ by 1,
and appending an additional $\lambda_1 - 3j - \ell(\lambda) - 1$ parts equal to 1. (b) For $n = 21$, $j = 1$,
$\lambda = (5, 5, 4, 3, 2)$, compute $I(j, \lambda)$ and verify that $I(I(j, \lambda)) = (j, \lambda)$.


8.88. We say that an integer partition $\lambda$ extends a partition $\mu$ iff for all $k$, $k$ occurs in
$\lambda$ at least as often as $k$ occurs in $\mu$; otherwise, $\lambda$ avoids $\mu$. Suppose $\{\mu^i : i \ge 1\}$ and
$\{\nu^i : i \ge 1\}$ are two sequences of distinct, nonzero partitions such that for all finite $S \subseteq \mathbb{N}^+$,
Dies Hl = Vies "|. (a) Prove that for every N, the number of partitions of N that avoid 
every $\mu^i$ equals the number of partitions of $N$ that avoid every $\nu^i$. (b) Show how 8.24 can
be deduced from (a). (c) Use (a) to prove that the number of partitions of N into parts 
congruent to 1 or 5 mod 6 equals the number of partitions of N into distinct parts not 
divisible by 3. 


Notes 


The applications of formal power series to combinatorial problems go well beyond the topics 
covered in this chapter. The texts [10, 127, 139] offer more detailed treatments of the uses 
of generating functions in combinatorics. Two more classical references are [90, 113]. For
an introduction to the vast subject of partition identities, the reader may consult [5, 102].
Sylvester’s bijection appears in [129], Glaisher’s bijection in [54], and Franklin’s involution 
in [44]. The Rogers-Ramanujan identities are discussed in [106, 117]; Garsia and Milne gave 
the first bijective proof of these identities [49, 50]. Our treatment of compositional inversion 
closely follows the presentation in [107].


Permutations and Group Actions


This chapter contains an introduction to some aspects of group theory that are directly 
related to combinatorial problems. The first part of the chapter gives the basic definitions 
of group theory and derives some fundamental properties of symmetric groups. We apply 
this material to give combinatorial derivations of the basic properties of determinants. The 
second part of the chapter discusses group actions, which have many applications to algebra 
and combinatorics. In particular, group actions can be used to solve counting problems in 
which symmetry must be taken into account. For example, how many ways can we color a 
5 × 5 chessboard with seven colors, if all rotations and reflections of a given colored board are
considered the same? The theory of group actions provides systematic methods for solving 
problems like this one. 




9.1 Definition and Examples of Groups 


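9.1. Definition: Groups. A group consists of a set $G$ and a binary operation $\star : G \times G \to G$
subject to the following axioms:

\[
\begin{aligned}
&\forall x, y, z \in G,\ x \star (y \star z) = (x \star y) \star z && \text{(associativity)};\\
&\exists e \in G,\ \forall x \in G,\ x \star e = x = e \star x && \text{(identity)};\\
&\forall x \in G,\ \exists y \in G,\ x \star y = e = y \star x && \text{(inverses)}.
\end{aligned}
\]

The requirement that $\star$ map $G \times G$ into $G$ is often stated explicitly as the following axiom:
\[ \forall x, y \in G,\ x \star y \in G \quad \text{(closure)}. \]
A group $G$ is called abelian or commutative iff $G$ satisfies the additional axiom
\[ \forall x, y \in G,\ x \star y = y \star x \quad \text{(commutativity)}. \]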


9.2. Example: Additive Groups. The set $\mathbb{Z}$ of all integers, with addition as the operation,
is a commutative group. The identity element is $e = 0$ and the (additive) inverse of $x \in \mathbb{Z}$
is $-x \in \mathbb{Z}$. Similarly, $\mathbb{Q}$ and $\mathbb{R}$ and $\mathbb{C}$ are all commutative groups under addition. $\mathbb{N}^+$ is not
a group under addition because there is no identity element in the set $\mathbb{N}^+$. $\mathbb{N}$ is not a group
under addition because $1 \in \mathbb{N}$ has no additive inverse in the set $\mathbb{N}$. The three-element set
$S = \{-1, 0, 1\}$ is not a group under addition because closure fails ($1 + 1 = 2 \notin S$).


9.3. Example: Multiplicative Groups. The set $\mathbb{Q}^+$ of strictly positive rational numbers
is a commutative group under multiplication. The identity element is $e = 1$ and the inverse
of $a/b \in \mathbb{Q}^+$ is $b/a$. Similarly, $\mathbb{R}^+$ is a group under multiplication. The set $\mathbb{Q}$ is not a group
under multiplication because 0 has no inverse. On the other hand, $\mathbb{Q} \sim \{0\}$, $\mathbb{R} \sim \{0\}$, and
$\mathbb{C} \sim \{0\}$ are groups under multiplication. So is the two-element set $\{-1, 1\} \subseteq \mathbb{Q}$, and the
four-element set $\{1, i, -1, -i\} \subseteq \mathbb{C}$.


9.4. Example: Symmetric Groups. Let $X$ be any set, and let $\mathrm{Sym}(X)$ be the set of
all bijections $f : X \to X$. For $f, g \in \mathrm{Sym}(X)$, define $f \circ g$ to be the composite function
that sends $x \in X$ to $f(g(x))$. Then $f \circ g \in \mathrm{Sym}(X)$ since the composition of bijections
is a bijection, so the axiom of closure holds. Given $f, g, h \in \mathrm{Sym}(X)$, note that both of
the functions $(f \circ g) \circ h : X \to X$ and $f \circ (g \circ h) : X \to X$ send $x \in X$ to $f(g(h(x)))$.
So these functions are equal, proving the axiom of associativity. Next, take $e$ to be the
bijection $\mathrm{id}_X : X \to X$, which is defined by $\mathrm{id}_X(x) = x$ for all $x \in X$. One immediately
checks that $f \circ \mathrm{id}_X = f = \mathrm{id}_X \circ f$ for all $f \in \mathrm{Sym}(X)$, so the identity axiom holds. Finally,
given a bijection $f \in \mathrm{Sym}(X)$, there exists an inverse function $f^{-1} : X \to X$ that is also a
bijection, and which satisfies $f \circ f^{-1} = \mathrm{id}_X = f^{-1} \circ f$. So the axiom of inverses holds. This
completes the verification that $(\mathrm{Sym}(X), \circ)$ is a group. This group is called the symmetric
group on $X$, and elements of $\mathrm{Sym}(X)$ are called permutations of $X$. Symmetric groups play
a central role in group theory and are closely related to group actions. In the special case
when $X = \{1, 2, \ldots, n\}$, we write $S_n$ to denote the group $\mathrm{Sym}(X)$.

Most of the groups $\mathrm{Sym}(X)$ are not commutative. For instance, consider $f, g \in S_3$ given
by
\[ f(1) = 2,\ f(2) = 1,\ f(3) = 3; \qquad g(1) = 3,\ g(2) = 2,\ g(3) = 1. \]
We see that $(f \circ g)(1) = f(g(1)) = 3$, whereas $(g \circ f)(1) = g(f(1)) = 2$. So $f \circ g \ne g \circ f$,
and the axiom of commutativity fails.


9.5. Example: Integers modulo n. Let $n$ be a fixed positive integer. Consider the set
$\mathbb{Z}_n = \{0, 1, 2, \ldots, n - 1\}$. We define a binary operation $\oplus$ on $\mathbb{Z}_n$ by setting, for all $x, y \in \mathbb{Z}_n$,
\[
x \oplus y = \begin{cases} x + y & \text{if } x + y < n;\\ x + y - n & \text{if } x + y \ge n. \end{cases}
\]
Closure follows from this definition, once we note that $0 \le x + y \le 2n - 2$ for $x, y \in \mathbb{Z}_n$.
The identity element is 0. The inverse of 0 is 0, while for $x > 0$ in $\mathbb{Z}_n$, the inverse of $x$ is
$n - x \in \mathbb{Z}_n$. To verify associativity, one may prove the relations
\[
(x \oplus y) \oplus z = \begin{cases}
x + y + z & \text{if } x + y + z < n;\\
x + y + z - n & \text{if } n \le x + y + z < 2n;\\
x + y + z - 2n & \text{if } 2n \le x + y + z < 3n;
\end{cases}
\ = x \oplus (y \oplus z), \tag{9.1}
\]
which can be established by a tedious case analysis. Commutativity of $\oplus$ follows from the
definition and the commutativity of ordinary integer addition. We conclude that $(\mathbb{Z}_n, \oplus)$
is a commutative group containing $n$ elements. In particular, for every positive integer $n$,
there exists a group of cardinality $n$.
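
For small $n$, the case analysis behind Example 9.5 can be replaced by an exhaustive check. The following Python sketch is only an illustration: the helper names `add_mod` and `is_commutative_group` are our own choices, not notation from the text, and a brute-force test over finitely many $n$ is of course no substitute for the proof.

```python
from itertools import product

def add_mod(n):
    """Return the operation of Example 9.5 on Z_n = {0, ..., n-1} (hypothetical helper)."""
    def op(x, y):
        s = x + y
        return s if s < n else s - n
    return op

def is_commutative_group(elements, op, identity):
    """Check closure, associativity, identity, inverses, and commutativity by brute force."""
    elements = list(elements)
    pairs = list(product(elements, repeat=2))
    closed = all(op(x, y) in elements for x, y in pairs)
    assoc = all(op(x, op(y, z)) == op(op(x, y), z)
                for x, y, z in product(elements, repeat=3))
    ident = all(op(x, identity) == x == op(identity, x) for x in elements)
    inverses = all(any(op(x, y) == identity == op(y, x) for y in elements)
                   for x in elements)
    commutative = all(op(x, y) == op(y, x) for x, y in pairs)
    return closed and assoc and ident and inverses and commutative

# (Z_n, +) passes for every small n we try.
print(all(is_commutative_group(range(n), add_mod(n), 0) for n in range(1, 8)))

# The set {-1, 0, 1} under ordinary addition fails (closure breaks, as in Example 9.2).
print(is_commutative_group([-1, 0, 1], lambda x, y: x + y, 0))
```

The same checker applies to any finite operation given as a Python function, so it can also be used to test associativity of an operation defined by a multiplication table.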


9.6. Definition: Multiplication Tables. If $(G, \star)$ is a group and $G = \{x_1, \ldots, x_n\}$ is
finite, then a multiplication table for $G$ is an $n \times n$ table, with rows and columns labeled
by the $x_i$'s, such that the element in row $i$ and column $j$ is $x_i \star x_j$. When the operation is
written additively, we refer to this table as the addition table for $G$. It is customary, but
not mandatory, to take $x_1$ to be the identity element of $G$.


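9.7. Example. The multiplication tables for $(\{1, i, -1, -i\}, \times)$ and for $(\mathbb{Z}_4, \oplus)$ are shown
here:

   ×  |   1    i   −1   −i
  ----+--------------------
   1  |   1    i   −1   −i
   i  |   i   −1   −i    1
  −1  |  −1   −i    1    i
  −i  |  −i    1    i   −1

   ⊕  |  0  1  2  3
  ----+-------------
   0  |  0  1  2  3
   1  |  1  2  3  0
   2  |  2  3  0  1
   3  |  3  0  1  2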


The reader may notice a relationship between the two tables: each row within the table is
obtained from the preceding one by a cyclic shift one step to the left. (Using terminology
to be discussed later, this happens because each of the two groups under consideration is 
cyclic of size four.) 

One can define a group operation by specifying its multiplication table. For example, 
here is the table for another group of size four (which turns out not to be cyclic): 


The identity and inverse axioms can be checked from inspection of the table (here a is the 
identity, and every element is equal to its own inverse). There is no quick way to verify 
associativity by visual inspection of the table, but this axiom can be checked exhaustively 
using the table entries. 

All of the groups in this example are commutative. This can be read off from the
multiplication tables by noting the symmetry about the main diagonal line (the entry $x_i \star x_j$ in
row $i$ and column $j$ always equals the entry $x_j \star x_i$ in row $j$ and column $i$).




9.2. Basic Properties of Groups 


We now collect some facts about groups that follow from the defining axioms. 

First, the identity element $e$ in a group $(G, \star)$ is unique. For, suppose $e' \in G$ also satisfies
the identity axiom. On one hand, $e \star e' = e$ since $e'$ is an identity element. On the other
hand, $e \star e' = e'$ since $e$ is an identity element. So $e = e'$. (The very statement of the
inverse axiom makes implicit use of the uniqueness of the identity.) We use the symbol $e_G$
to denote the identity element of an abstract group $G$. When the operation is addition or
multiplication, we write $0_G$ or $1_G$ instead, dropping the $G$ if it is understood from context.

Similarly, the inverse of an element $x$ in a group $G$ is unique. For suppose $y, y' \in G$ both
satisfy the condition in the inverse axiom. Then
\[ y = y \star e = y \star (x \star y') = (y \star x) \star y' = e \star y' = y'. \]
We denote the unique inverse of $x$ in $G$ by the symbol $x^{-1}$. When the operation is written
additively, the symbol $-x$ is used.

A product such as $x \star y$ is often written $xy$, except in the additive case. The associativity
axiom can be used to show that any parenthesization of a product $x_1 x_2 \cdots x_n$ will give the
same answer (see 2.148), so it is permissible to omit parentheses in products like these.


9.8. Theorem: Cancellation Laws and Inverse Rules. Suppose $a, x, y$ are elements
in a group $G$. (a) $ax = ay$ implies $x = y$ (left cancellation); (b) $xa = ya$ implies $x = y$ (right
cancellation); (c) $(x^{-1})^{-1} = x$; (d) $(xy)^{-1} = y^{-1}x^{-1}$ (inverse rule for products).

Proof. Starting from $ax = ay$, multiply both sides on the left by $a^{-1}$ to get $a^{-1}(ax) =
a^{-1}(ay)$. Then the associativity axiom gives $(a^{-1}a)x = (a^{-1}a)y$; the inverse axiom gives
$ex = ey$; and the identity axiom gives $x = y$. Right cancellation is proved similarly. Next,
note that
\[ (x^{-1})^{-1} x^{-1} = e = x x^{-1} \]
by the definition of the inverse of $x$ and of $x^{-1}$; right cancellation of $x^{-1}$ yields $(x^{-1})^{-1} = x$.
Similarly, routine calculations using the group axioms show that
\[ (xy)^{-1}(xy) = e = (y^{-1}x^{-1})(xy), \]
so right cancellation of $xy$ gives the inverse rule for products. □


9.9. Definition: Exponent Notation. Let $G$ be a group written multiplicatively. Given
$x \in G$, recursively define $x^0 = 1 = e_G$ and $x^{n+1} = x^n \star x$ for all $n \ge 0$. To define negative
powers of $x$, set $x^{-n} = (x^{-1})^n$ for all $n > 0$.


Informally, for positive $n$, $x^n$ is the product of $n$ copies of $x$. For negative $n$, $x^n$ is the
product of $|n|$ copies of the inverse of $x$. Note in particular that $x^1 = x$ and $x^{-1}$ is the
inverse of $x$ (in accordance with the conventions introduced before this definition). When
$G$ is written additively, we write $nx$ instead of $x^n$; this denotes the sum of $n$ copies of $x$ for
$n > 0$, or the sum of $|n|$ copies of $-x$ for $n < 0$.


9.10. Theorem: Laws of Exponents. Suppose $G$ is a group written multiplicatively,
$x \in G$, and $m, n \in \mathbb{Z}$. Then $x^{m+n} = x^m x^n$ and $x^{mn} = (x^n)^m$. If $x, y \in G$ satisfy $xy = yx$,
then $(xy)^n = x^n y^n$. In additive notation, these results read: $(m + n)x = mx + nx$; $(mn)x =
m(nx)$; and $n(x + y) = nx + ny$ when $x + y = y + x$.


The idea of the proof is to use induction to establish the results for m,n > 0, and then 
use case analyses to handle the situations where m or n or both is negative. We leave the 
details as an exercise for the reader. 




9.3. Notation for Permutations 


Permutations and symmetric groups arise frequently in the theory of groups and group 
actions. So we will now develop some notation for describing permutations. 


9.11. Definition: Two-Line Form of a Function. Let $X$ be a finite set, and let
$x_1, \ldots, x_n$ be a list of all the distinct elements of $X$ in some fixed order. The two-line
form of a function $f : X \to X$ relative to this ordering is the array
\[ f = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ f(x_1) & f(x_2) & \cdots & f(x_n) \end{pmatrix}. \]
If $X = \{1, 2, \ldots, n\}$, we usually display the elements of $X$ on the top line in the order
$1, 2, \ldots, n$.


9.12. Example. The notation
\[ f = \begin{pmatrix} a & b & c & d & e \\ b & c & e & a & b \end{pmatrix} \]
defines a function on the set $X = \{a, b, c, d, e\}$ such that $f(a) = b$, $f(b) = c$, $f(c) = e$,
$f(d) = a$, and $f(e) = b$. This function is not a permutation, since $b$ occurs twice in the
bottom row and $d$ never occurs. The notation
\[ g = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix} \]
defines an element of $S_5$ such that $g(1) = 2$, $g(2) = 4$, $g(3) = 5$, $g(4) = 1$, and $g(5) = 3$.
Observe that the inverse of $g$ sends 2 to 1, 4 to 2, and so on. So, we obtain one possible
two-line form of $g^{-1}$ by interchanging the lines in the two-line form of $g$:
\[ g^{-1} = \begin{pmatrix} 2 & 4 & 5 & 1 & 3 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix}. \]




FIGURE 9.1: Digraph associated to the permutation h.


It is customary to write the numbers in the top line in increasing order. This can be 
accomplished by sorting the columns of the previous array: 


\[ g^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 5 & 2 & 3 \end{pmatrix}. \]


Recall that the group operation in Sym(X) is composition. We can compute the composition 
of two functions written in two-line form by tracing the effect of the composite function on 
each element. For instance, 


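\[
\begin{pmatrix} a & b & c & d \\ b & d & a & c \end{pmatrix} \circ
\begin{pmatrix} a & b & c & d \\ a & c & d & b \end{pmatrix} =
\begin{pmatrix} a & b & c & d \\ b & a & c & d \end{pmatrix},
\]
because the left side maps $a$ to $a$ and then to $b$; $b$ maps to $c$ and then to $a$; and so on.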
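
The element-by-element tracing just described is easy to mechanize. In the sketch below, a function on a finite set is stored as a Python dictionary whose keys form the top line and whose values form the bottom line of the two-line form; the names `compose` and `inverse` are illustrative choices of ours, not notation from the text.

```python
def compose(f, g):
    """Return f o g, the function sending x to f(g(x)); g acts first."""
    return {x: f[g[x]] for x in g}

def inverse(f):
    """Swap the two lines of the two-line form (valid only when f is a bijection)."""
    return {value: key for key, value in f.items()}

# The composition example above: bottom rows "bdac" and "acdb" over top row "abcd".
f = dict(zip("abcd", "bdac"))
g = dict(zip("abcd", "acdb"))
fg = compose(f, g)
print("".join(fg[x] for x in "abcd"))   # bottom row of f o g: bacd

# The permutation g of Example 9.12 and the sorted two-line form of its inverse.
g5 = {1: 2, 2: 4, 3: 5, 4: 1, 5: 3}
g5_inv = inverse(g5)
print([g5_inv[i] for i in sorted(g5_inv)])   # [4, 1, 5, 2, 3]
```

Sorting the keys of `g5_inv` corresponds to sorting the columns of the two-line array so that the top line is in increasing order.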


If the ordering of X is fixed and known from context, we may omit the top line of the 
two-line form. This leads to one-line notation for a function defined on X. 


9.13. Definition: One-Line Form of a Function. Let $X = \{x_1 < x_2 < \cdots < x_n\}$
be a finite totally ordered set. The one-line form of a function $f : X \to X$ is the array
$[f(x_1)\ f(x_2)\ \cdots\ f(x_n)]$. We use square brackets to avoid a conflict with the cycle notation
to be introduced below. Sometimes we omit the brackets, identifying $f$ with the word
$f(x_1) f(x_2) \cdots f(x_n)$.


9.14. Example. The functions f and g in the preceding example are given in one-line form
by writing f = [b c e a b] and g = [2 4 5 1 3]. Note that the one-line form of an element
of Sym(X) is literally a permutation of the elements of X, as defined in §1.4. This explains 
why elements of this group are called permutations. 


9.15. Cycle Notation for Permutations. Assume X is a finite set. Recall from §3.6 that 
any function f : X — X can be represented by a digraph with vertex set X and directed 
edges $\{(i, f(i)) : i \in X\}$. A digraph on $X$ arises from a function in this way iff every vertex
in $X$ has outdegree 1. In 3.45 we proved that the digraph of a permutation is a disjoint
union of directed cycles. For example, Figure 9.1 displays the digraph of the permutation
\[ h = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 7 & 8 & 4 & 3 & 10 & 2 & 5 & 6 & 9 & 1 \end{pmatrix}. \]
We can describe a directed cycle in a digraph by traversing the edges in the cycle and listing 
the elements we encounter in the order of visitation, enclosing the whole list in parentheses. 
For example, the cycle containing 1 in Figure 9.1 can be described by writing (1, 7,5, 10). 
The cycle containing 9 is denoted by (9). To describe the entire digraph of a permutation, 


we write down all the cycles in the digraph, one after the other. For example, h can be 
written in cycle notation as 


h = (1, 7, 5, 10)(2, 8, 6)(3, 4)(9).

This cycle notation is not unique. We are free to begin our description of each cycle at any
vertex in the cycle, and we are free to rearrange the order of the cycles. Furthermore, by 
convention it is permissible to omit some or all cycles of length 1. For example, some other 
cycle notations for h are 


h = (5, 10, 1,7)(3, 4)(9)(6, 2, 8) = (2,8, 6)(4, 3)(7, 5, 10, 1). 


To compute the inverse of a permutation written in cycle notation, we reverse the orientation 
of each cycle. For example, 


$h^{-1} = (10, 5, 7, 1)(6, 8, 2)(4, 3)(9)$.


9.16. Example. Using cycle notation, we can list the six elements of $S_3$ as follows:
$S_3 = \{(1)(2)(3), (1, 2), (1, 3), (2, 3), (1, 2, 3), (1, 3, 2)\}$.


9.17. Example. To compose permutations written in cycle notation, we must see how the
composite function acts on each element. For instance, consider the product $(3, 5)(1, 2, 4) \circ
(3, 5, 2, 1)$ in $S_5$. This composite function sends 1 to 3 and then 3 to 5, so 1 maps to 5. Next,
2 maps first to 1 and then to 2, so 2 maps to 2. Continuing similarly, we find that
\[
(3, 5)(1, 2, 4) \circ (3, 5, 2, 1) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 2 & 3 & 1 & 4 \end{pmatrix} = (1, 5, 4)(2)(3).
\]


With enough practice, one can proceed immediately to the cycle form of the answer without 
writing down the two-line form or other scratch work. 
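
Until that fluency develops, a short program can serve as a check on hand computations in cycle notation. The sketch below uses our own helper names (`from_cycles`, `cycles`, `compose`) and stores a permutation of $\{1, \ldots, n\}$ as a dictionary; `from_cycles` assumes the cycles it is given are pairwise disjoint, so products of overlapping cycles are formed with `compose`.

```python
def from_cycles(cycle_list, n):
    """Build a permutation of {1, ..., n} from pairwise disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for c in cycle_list:
        # each entry maps to its successor; the last entry wraps around to the first
        for a, b in zip(c, c[1:] + c[:1]):
            perm[a] = b
    return perm

def cycles(perm):
    """Cycle notation of perm, each cycle starting at its smallest element."""
    seen, result = set(), []
    for start in sorted(perm):
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        if cycle:
            result.append(tuple(cycle))
    return result

def compose(f, g):
    """Return f o g (apply g first, then f)."""
    return {x: f[g[x]] for x in g}

# The permutation h of Figure 9.1.
h = from_cycles([(1, 7, 5, 10), (2, 8, 6), (3, 4), (9,)], 10)
print([h[i] for i in range(1, 11)])   # bottom row of the two-line form
print(cycles(h))

# Example 9.17: (3,5)(1,2,4) o (3,5,2,1) in S_5.
left = from_cycles([(3, 5), (1, 2, 4)], 5)
right = from_cycles([(3, 5, 2, 1)], 5)
print(cycles(compose(left, right)))   # [(1, 5, 4), (2,), (3,)]
```

Reversing each tuple before calling `from_cycles` produces the inverse permutation, in agreement with the rule for inverting a permutation in cycle notation.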


9.18. Definition: k-cycles. For k > 1, a permutation f € Sym(X) whose digraph consists 
of one cycle of length k and all other cycles of length 1 is called a k-cycle. 


9.19. Remark. We can view the cycle notation for a permutation $f$ as a way of factorizing
$f$ in the group $S_n$ into a product of cycles. For example,
\[ (1, 7, 5, 10)(2, 8, 6)(3, 4)(9) = (1, 7, 5, 10) \circ (2, 8, 6) \circ (3, 4) \circ (9). \]
Here we have expressed the single permutation on the left side as a product of four other
permutations in $S_{10}$. The stated equality may be verified by checking that both sides have
the same effect on each $k \in \{1, 2, \ldots, 10\}$.


9.20. Definition: cyc(f) and type(f). Given a permutation f € Sym(X), let cyc(f) be 
the number of components (cycles) in the digraph for f. Let type(f) be the list of sizes of 
these components, including repetitions and written in weakly decreasing order. 


Note that type(f) is an integer partition of $n = |X|$.


9.21. Example. The permutation h in Figure 9.1 has cyc(h) = 4 and type(h) = (4,3, 2, 1). 
The identity element of $S_n$, namely $\mathrm{id} = (1)(2) \cdots (n)$, has cyc(id) = $n$ and type(id) =
$(1, \ldots, 1)$. Table 9.1 displays the 24 elements of $S_4$ in cycle notation, collecting together all
permutations with the same type and counting the number of permutations of each type. 
In 9.134, we will give a general formula for the number of permutations of n objects having 
a given type.

TABLE 9.1
Elements of $S_4$.

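Type         Permutations of this type                                  Count
(1,1,1,1)    (1)(2)(3)(4)                                               1
(2,1,1)      (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)                   6
(2,2)        (1,2)(3,4), (1,3)(2,4), (1,4)(2,3)                         3
(3,1)        (1,2,3), (1,2,4), (1,3,2), (1,3,4),
             (1,4,2), (1,4,3), (2,3,4), (2,4,3)                         8
(4)          (1,2,3,4), (1,2,4,3), (1,3,2,4),
             (1,3,4,2), (1,4,2,3), (1,4,3,2)                            6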


9.4 Inversions and Sign 


In this section, we use inversions of permutations to define the sign function $\mathrm{sgn} : S_n \to
\{+1, -1\}$. We then study factorizations of permutations into products of transpositions to
derive facts about the sgn function. Let us begin by recalling the definition of inversions
(§6.2).


9.22. Definition: Inversions and Sign of a Permutation. Let $w = w_1 w_2 \cdots w_n \in S_n$
be a permutation written in one-line form. An inversion of $w$ is a pair of indices $i < j$ such
that $w_i > w_j$. The number of inversions of $w$ is denoted $\mathrm{inv}(w)$. Furthermore, the sign of
$w$ is defined to be $\mathrm{sgn}(w) = (-1)^{\mathrm{inv}(w)}$.
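
Definition 9.22 translates directly into a few lines of code. The following Python sketch is illustrative only: the function names are ours, a permutation is passed in one-line form as a list, and inversions are reported with 1-based indices to match the text.

```python
def inversions(w):
    """List the inversions (i, j), with 1-based indices i < j and w_i > w_j."""
    n = len(w)
    return [(i + 1, j + 1)
            for i in range(n) for j in range(i + 1, n)
            if w[i] > w[j]]

def inv(w):
    """Number of inversions of w."""
    return len(inversions(w))

def sgn(w):
    """Sign of w, defined as (-1) raised to the number of inversions."""
    return -1 if inv(w) % 2 else 1

w = [4, 2, 5, 3, 1]
print(inversions(w))     # [(1, 2), (1, 4), (1, 5), (2, 5), (3, 4), (3, 5), (4, 5)]
print(inv(w), sgn(w))    # 7 -1
```

This quadratic-time count is the simplest possible choice; faster methods exist but are not needed for the small examples in this chapter.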


9.23. Example. Given $w = 42531$, we have $\mathrm{inv}(w) = 7$ and $\mathrm{sgn}(w) = -1$. The seven
inversions of $w$ are (1,2), (1,4), (1,5), (2,5), (3,4), (3,5), and (4,5). For instance, (1,4) is
an inversion because $w_1 = 4 > 3 = w_4$. The following table displays $\mathrm{inv}(f)$ and $\mathrm{sgn}(f)$ for
all $f \in S_3$:

  f        123   132   213   231   312   321
  inv(f)    0     1     1     2     2     3
  sgn(f)   +1    −1    −1    +1    +1    −1


We want to understand how the group operation in S,, (composition of permutations) is 
related to inversions and sign. For this purpose, we introduce the concept of a transposition. 


9.24. Definition: Transpositions. A transposition in $S_n$ is a permutation $f$ of the form
$(i, j)$, for some $i, j \le n$. Note that $f(i) = j$, $f(j) = i$, and $f(k) = k$ for all $k \ne i, j$. A basic
transposition in $S_n$ is a transposition $(i, i + 1)$, for some $i < n$.


The following lemmas illuminate the connection between basic transpositions and the 
process of sorting the one-line form of a permutation into increasing order. 


9.25. Lemma: Basic Transpositions and Sorting. Let $w = w_1 \cdots w_i w_{i+1} \cdots w_n \in S_n$
be a permutation in one-line form. For each $i < n$,
\[ w \circ (i, i + 1) = w_1 \cdots w_{i+1} w_i \cdots w_n. \]
So right-multiplication by the basic transposition $(i, i + 1)$ interchanges the elements in
positions $i$ and $i + 1$ of $w$.


Proof. Let us evaluate the function $f = w \circ (i, i + 1)$ at each $k \le n$. When $k = i$, $f(i) =
w(i + 1)$. When $k = i + 1$, $f(i + 1) = w(i)$. When $k \ne i$ and $k \ne i + 1$, $f(k) = w(k)$. So the
one-line form of $f$ is $w_1 \cdots w_{i+1} w_i \cdots w_n$, as desired. □


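9.26. Lemma: Basic Transpositions and Inversions. Let $w = w_1 \cdots w_n \in S_n$ be a
permutation in one-line form, and let $i < n$. Then
\[
\mathrm{inv}(w \circ (i, i + 1)) = \begin{cases} \mathrm{inv}(w) + 1 & \text{if } w_i < w_{i+1};\\ \mathrm{inv}(w) - 1 & \text{if } w_i > w_{i+1}. \end{cases}
\]
Consequently, in all cases, we have
\[ \mathrm{sgn}(w \circ (i, i + 1)) = -\mathrm{sgn}(w). \]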


Proof. We use the result of the previous lemma to compare the inversions of $w$ and $w' =
w \circ (i, i + 1)$. Let $j < k$ be two indices between 1 and $n$, and consider various cases. First,
if $j \ne i, i + 1$ and $k \ne i, i + 1$, then $(j, k)$ is an inversion of $w$ iff $(j, k)$ is an inversion of $w'$,
since $w_j = w'_j$ and $w_k = w'_k$. Second, if $j = i$ and $k > i + 1$, then $(i, k)$ is an inversion of $w$
iff $(i + 1, k)$ is an inversion of $w'$, since $w_i = w'_{i+1}$ and $w_k = w'_k$. Similar results hold in the
cases ($j = i + 1 < k$), ($j < k = i$), and ($j < i$, $k = i + 1$). The critical case is when $j = i$ and
$k = i + 1$. If $w_i < w_{i+1}$, then $(j, k)$ is an inversion of $w'$ but not of $w$. If $w_i > w_{i+1}$, then
$(j, k)$ is an inversion of $w$ but not of $w'$. This establishes the first formula in the lemma.
The remaining formula follows since $(-1)^{+1} = (-1)^{-1} = -1$. □


The proof of the next lemma is left as an exercise. 


9.27. Lemma. For all $n \ge 1$, the identity permutation $\mathrm{id} = 1, 2, \ldots, n$ is the unique element
of $S_n$ satisfying $\mathrm{inv}(\mathrm{id}) = 0$. We have $\mathrm{sgn}(\mathrm{id}) = +1$.


If $f = (i, i + 1)$ is a basic transposition, then the ordered pair $(i, i + 1)$ is the only inversion
of $f$, so $\mathrm{inv}(f) = 1$ and $\mathrm{sgn}(f) = -1$. More generally, we now show that any transposition
has sign $-1$.


9.28. Lemma. If $f = (i, j)$ is any transposition, then $\mathrm{sgn}(f) = -1$.


Proof. Since $(i, j) = (j, i)$, we may assume that $i < j$. Let us write $f$ in two-line form:
\[
f = \begin{pmatrix} 1 & \cdots & i & i+1 & \cdots & j-1 & j & \cdots & n \\ 1 & \cdots & j & i+1 & \cdots & j-1 & i & \cdots & n \end{pmatrix}.
\]
We can find the inversions of $f$ by inspecting the two-line form. The inversions are: all $(i, k)$
with $i < k \le j$; and all $(k, j)$ with $i < k < j$. There are $j - i$ inversions of the first type and
$j - i - 1$ inversions of the second type, hence $2(j - i) - 1$ inversions total. Since this number
is odd, we conclude that $\mathrm{sgn}(f) = -1$. □


9.29. Theorem: Inversions and Sorting. Let $w = w_1 w_2 \cdots w_n \in S_n$ be a permutation in
one-line form. The number $\mathrm{inv}(w)$ is the minimum number of steps required to sort the word
$w$ into increasing order by repeatedly interchanging two adjacent elements. Furthermore, $w$
can be factored in $S_n$ into the product of $\mathrm{inv}(w)$ basic transpositions.




Proof. Given $w \in S_n$, it is certainly possible to sort $w$ into increasing order in finitely many
steps by repeatedly swapping adjacent elements. For instance, we can move 1 to the far
left position in at most $n - 1$ moves, then move 2 to its proper position in at most $n - 2$
moves, and so on. Let $m$ be the minimum number of moves of this kind that are needed to
sort $w$. By 9.25, we can accomplish each sorting move by starting with $w$ and repeatedly
multiplying on the right by suitable basic transpositions. Each such multiplication either
increases or decreases the inversion count by 1, according to 9.26. At the end, we have
transformed $w$ into the identity permutation. Combining these observations, we see that
$0 = \mathrm{inv}(\mathrm{id}) \ge \mathrm{inv}(w) - m$, so that $m \ge \mathrm{inv}(w)$. On the other hand, consider the following
particular sequence of sorting moves starting from $w$. If the current permutation $w^*$ is not
the identity, there exists a smallest index $i$ with $w^*_i > w^*_{i+1}$. Apply the basic transposition
$(i, i + 1)$, which reduces $\mathrm{inv}(w^*)$ by 1, and continue. This sorting method will end in exactly
$\mathrm{inv}(w)$ steps, since id is the unique permutation with zero inversions. This proves it is
possible to sort $w$ in $\mathrm{inv}(w)$ steps, so that $m \le \mathrm{inv}(w)$.

To prove the last part of the theorem, recall that the sorting process just described can
be implemented by right-multiplying by suitable basic transpositions. We therefore have an
equation in $S_n$ of the form
\[ w \circ (i_1, i_1 + 1) \circ (i_2, i_2 + 1) \circ \cdots \circ (i_m, i_m + 1) = \mathrm{id}. \]
Solving for $w$, and using the fact that $(i, j)^{-1} = (j, i) = (i, j)$, we get
\[ w = (i_m, i_m + 1) \circ \cdots \circ (i_2, i_2 + 1) \circ (i_1, i_1 + 1), \]
which expresses $w$ as a product of $m$ basic transpositions. □


9.30. Example. Let us apply the sorting algorithm in the preceding proof to write $w =
42531$ as a product of $\mathrm{inv}(w) = 7$ basic transpositions. Since $4 > 2$, we first multiply $w$ on
the right by (1,2) to obtain
\[ w \circ (1, 2) = 24531. \]
(Observe that $\mathrm{inv}(24531) = 6 = \mathrm{inv}(w) - 1$.) Next, since $5 > 3$, we multiply on the right by
(3,4) to get
\[ w \circ (1, 2) \circ (3, 4) = 24351. \]
The computation continues as follows:
\[
\begin{aligned}
w \circ (1,2) \circ (3,4) \circ (2,3) &= 23451;\\
w \circ (1,2) \circ (3,4) \circ (2,3) \circ (4,5) &= 23415;\\
w \circ (1,2) \circ (3,4) \circ (2,3) \circ (4,5) \circ (3,4) &= 23145;\\
w \circ (1,2) \circ (3,4) \circ (2,3) \circ (4,5) \circ (3,4) \circ (2,3) &= 21345;\\
w \circ (1,2) \circ (3,4) \circ (2,3) \circ (4,5) \circ (3,4) \circ (2,3) \circ (1,2) &= 12345 = \mathrm{id}.
\end{aligned}
\]
We now solve for $w$, which has the effect of reversing the order of the basic transpositions
we used to reach the identity:
\[ w = (1,2) \circ (2,3) \circ (3,4) \circ (4,5) \circ (2,3) \circ (3,4) \circ (1,2). \]


It is also possible to find such a factorization by starting with the identity word and
"unsorting" to reach $w$. Here it will not be necessary to reverse the order of the transpositions
at the end. We illustrate this idea with the following computation:
\[
\begin{aligned}
\mathrm{id} &= 12345;\\
\mathrm{id} \circ (3,4) &= 12435;\\
\mathrm{id} \circ (3,4) \circ (2,3) &= 14235;\\
\mathrm{id} \circ (3,4) \circ (2,3) \circ (1,2) &= 41235;\\
\mathrm{id} \circ (3,4) \circ (2,3) \circ (1,2) \circ (2,3) &= 42135;\\
\mathrm{id} \circ (3,4) \circ (2,3) \circ (1,2) \circ (2,3) \circ (4,5) &= 42153;\\
\mathrm{id} \circ (3,4) \circ (2,3) \circ (1,2) \circ (2,3) \circ (4,5) \circ (3,4) &= 42513;\\
\mathrm{id} \circ (3,4) \circ (2,3) \circ (1,2) \circ (2,3) \circ (4,5) \circ (3,4) \circ (4,5) &= 42531 = w.
\end{aligned}
\]
So $w = (3,4) \circ (2,3) \circ (1,2) \circ (2,3) \circ (4,5) \circ (3,4) \circ (4,5)$. Observe that this is a different
factorization of $w$ from the one obtained earlier, although both involve seven basic
transpositions. This shows that factorizations of permutations into products of basic transpositions
are not unique. It is also possible to find factorizations involving more than seven factors,
by interchanging two entries that are already in the correct order during the sorting of $w$
into id. So the number of factors in such factorizations is not unique either; but we will see
shortly that the parity of the number of factors (odd or even) is uniquely determined by $w$.
In fact, the parity is odd when $\mathrm{sgn}(w) = -1$ and even when $\mathrm{sgn}(w) = +1$.
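
The sorting procedure from the proof of 9.29 can be sketched in Python as follows. The function names are hypothetical; positions are 0-based inside the code and converted to the 1-based basic transpositions $(i, i+1)$ of the text when they are recorded. By 9.25, right-multiplication by $(i, i+1)$ is implemented as a swap of the entries in positions $i$ and $i+1$.

```python
def sorting_moves(w):
    """Sort w by always swapping at the smallest descent; return the moves used."""
    w = list(w)
    moves = []
    while True:
        # smallest 0-based position p with w[p] > w[p + 1], or None if w is sorted
        p = next((p for p in range(len(w) - 1) if w[p] > w[p + 1]), None)
        if p is None:
            return moves
        w[p], w[p + 1] = w[p + 1], w[p]
        moves.append((p + 1, p + 2))      # the basic transposition (i, i+1), 1-based

def product_of_basic(factors, n):
    """One-line form of t_1 o t_2 o ... o t_m, built as id o t_1 o ... o t_m."""
    w = list(range(1, n + 1))
    for i, j in factors:                  # each factor is (i, i+1)
        w[i - 1], w[j - 1] = w[j - 1], w[i - 1]
    return w

moves = sorting_moves([4, 2, 5, 3, 1])
print(moves)          # the seven moves of Example 9.30, in the order applied
factorization = list(reversed(moves))
print(product_of_basic(factorization, 5))   # [4, 2, 5, 3, 1]

# The second factorization found by "unsorting" gives the same permutation.
other = [(3, 4), (2, 3), (1, 2), (2, 3), (4, 5), (3, 4), (4, 5)]
print(product_of_basic(other, 5))           # [4, 2, 5, 3, 1]
```

Both factorizations have seven factors, matching $\mathrm{inv}(42531) = 7$; inserting a redundant pair such as (1,2), (1,2) anywhere in either list lengthens the factorization by two without changing the product, which illustrates why only the parity of the length is determined by $w$.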


We now have enough machinery to prove the fundamental properties of sgn. 


9.31. Theorem: Properties of Sign. (a) For all $f, g \in S_n$, $\mathrm{sgn}(f \circ g) = \mathrm{sgn}(f) \cdot \mathrm{sgn}(g)$.
(b) For all $f \in S_n$, $\mathrm{sgn}(f^{-1}) = \mathrm{sgn}(f)$.


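Proof. (a) If $g = \mathrm{id}$, then the result is true since $f \circ g = f$ and $\mathrm{sgn}(g) = 1$ in this case. If
$t = (i, i + 1)$ is a basic transposition, then 9.26 shows that $\mathrm{sgn}(f \circ t) = -\mathrm{sgn}(f)$. Given a
non-identity permutation $g$, use 9.29 to write $g$ as a nonempty product of basic transpositions,
say $g = t_1 \circ t_2 \circ \cdots \circ t_k$. Then, for every $f \in S_n$, iteration of 9.26 gives
\[
\begin{aligned}
\mathrm{sgn}(f \circ g) = \mathrm{sgn}(f t_1 \cdots t_{k-1} t_k) &= -\mathrm{sgn}(f t_1 \cdots t_{k-1})\\
&= (-1)^2\, \mathrm{sgn}(f t_1 \cdots t_{k-2}) = \cdots = (-1)^k\, \mathrm{sgn}(f).
\end{aligned}
\]
In particular, this equation is true when $f = \mathrm{id}$; in that case, we obtain $\mathrm{sgn}(g) = (-1)^k$.
Using this fact in the preceding equation produces $\mathrm{sgn}(f \circ g) = \mathrm{sgn}(g)\, \mathrm{sgn}(f) = \mathrm{sgn}(f)\, \mathrm{sgn}(g)$
for all $f \in S_n$.

(b) By part (a), $\mathrm{sgn}(f) \cdot \mathrm{sgn}(f^{-1}) = \mathrm{sgn}(f \circ f^{-1}) = \mathrm{sgn}(\mathrm{id}) = +1$. If $\mathrm{sgn}(f) = +1$, it
follows that $\mathrm{sgn}(f^{-1}) = +1$. If instead $\mathrm{sgn}(f) = -1$, then it follows that $\mathrm{sgn}(f^{-1}) = -1$. □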


Iteration of 9.31 shows that
\[ \mathrm{sgn}(f_1 \circ \cdots \circ f_k) = \prod_{i=1}^{k} \mathrm{sgn}(f_i). \tag{9.2} \]


9.32. Theorem: Factorizations into Transpositions. Let $f = t_1 \circ t_2 \circ \cdots \circ t_k$ be any
factorization of $f \in S_n$ into a product of transpositions (not necessarily basic ones). Then
$\mathrm{sgn}(f) = (-1)^k$. In particular, the parity of $k$ (odd or even) is uniquely determined by $f$.


Proof. By 9.28, $\mathrm{sgn}(t_i) = -1$ for all $i$. The conclusion now follows by setting $f_i = t_i$
in (9.2). □


9.33. Theorem: Sign of a k-cycle. The sign of any $k$-cycle $(i_1, i_2, \ldots, i_k)$ is $(-1)^{k-1}$.

Proof. The result is already known for $k = 1$ and $k = 2$. For $k > 2$, one may check that the
given $k$-cycle can be written as the following product of $k - 1$ transpositions:
\[ (i_1, i_2, \ldots, i_k) = (i_1, i_2) \circ (i_2, i_3) \circ (i_3, i_4) \circ \cdots \circ (i_{k-1}, i_k). \]
So the result follows from 9.32. □


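We can now show that the sign of a permutation $f$ is completely determined by type(f).

9.34. Theorem: Cycle Type and Sign. Suppose $f \in S_n$ has type(f) = $\mu$. Then
\[ \mathrm{sgn}(f) = \prod_{i=1}^{\ell(\mu)} (-1)^{\mu_i - 1} = (-1)^{|\mu| - \ell(\mu)} = (-1)^{n - \mathrm{cyc}(f)}. \]
Proof. Let the cycle decomposition of $f$ be $f = C_1 \circ \cdots \circ C_{\ell(\mu)}$ where $C_i$ is a $\mu_i$-cycle. The
result follows from the relations $\mathrm{sgn}(f) = \prod_{i=1}^{\ell(\mu)} \mathrm{sgn}(C_i)$ and $\mathrm{sgn}(C_i) = (-1)^{\mu_i - 1}$. □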


9.35. Example. The permutation $f = (4, 6, 2, 8)(3, 9, 1)(5, 10, 7)$ in $S_{10}$ has $\mathrm{sgn}(f) =
(-1)^{10-3} = -1$.
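
Theorem 9.34 can be checked numerically against the inversion-based definition of sgn. The Python sketch below is an illustration under our own naming (`inv`, `cycle_lengths`, `sgn_from_type`); it takes a permutation of $\{1, \ldots, n\}$ in one-line form and compares $(-1)^{\mathrm{inv}(w)}$ with $(-1)^{n - \mathrm{cyc}(w)}$ for every $w \in S_5$.

```python
from itertools import permutations

def inv(w):
    """Number of inversions of the one-line form w."""
    return sum(w[i] > w[j] for i in range(len(w)) for j in range(i + 1, len(w)))

def cycle_lengths(w):
    """type(w): the cycle sizes of w, listed in weakly decreasing order."""
    seen, sizes = set(), []
    for start in range(1, len(w) + 1):
        size, x = 0, start
        while x not in seen:
            seen.add(x)
            size += 1
            x = w[x - 1]        # follow the edge x -> w(x) in the digraph of w
        if size:
            sizes.append(size)
    return sorted(sizes, reverse=True)

def sgn_from_type(w):
    """Sign computed as (-1)^(n - cyc(w)), as in Theorem 9.34."""
    return (-1) ** (len(w) - len(cycle_lengths(w)))

# Both formulas for the sign agree on all 120 elements of S_5.
print(all((-1) ** inv(w) == sgn_from_type(w) for w in permutations(range(1, 6))))

# Example 9.35: one-line form of (4,6,2,8)(3,9,1)(5,10,7) in S_10.
f = [3, 8, 9, 6, 10, 2, 5, 4, 1, 7]
print(cycle_lengths(f), sgn_from_type(f))   # [4, 3, 3] -1
```

Computing the sign from the cycle type takes time linear in $n$, whereas the naive inversion count is quadratic, which is one practical reason to prefer the cycle formula for large permutations.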




9.5 Determinants 


In the next three sections, we interrupt our exposition of group theory to give an application 
of the preceding material to determinants. We will see that the combinatorial properties of 
permutations underlie many commonly used facts about determinants. 


9.36. Definition: Matrix Rings. For every commutative ring $R$ and positive integer $n$,
let $M_n(R)$ be the set of $n \times n$ matrices with entries in $R$. Formally, an element of $M_n(R)$
is a function $A : \{1, 2, \ldots, n\} \times \{1, 2, \ldots, n\} \to R$. $A(i, j)$ is called the $i, j$-entry of $A$.
We often display $A$ as a square array in which $A(i, j)$ appears in row $i$ and column $j$. For
$A, B \in M_n(R)$ and $c \in R$, define $A + B$, $AB$, and $cA$ by setting
\[
\begin{aligned}
(A + B)(i, j) &= A(i, j) + B(i, j);\\
(AB)(i, j) &= \sum_{k=1}^{n} A(i, k) B(k, j);\\
(cA)(i, j) &= c(A(i, j)) \qquad (1 \le i, j \le n).
\end{aligned}
\]
Routine verifications (see 2.151) show that $M_n(R)$ with these operations is a ring, whose
multiplicative identity element $I_n$ is given by $I_n(i, j) = 1_R$ if $i = j$, and $I_n(i, j) = 0_R$ if
$i \ne j$. One also checks that $M_n(R)$ is non-commutative if $n > 1$ and $R \ne \{0\}$.


9.37. Definition: Determinants. For a matrix $A \in M_n(R)$, the determinant of $A$ is
$$\det(A) = \sum_{w\in S_n} \operatorname{sgn}(w) \prod_{i=1}^{n} A(i, w(i)) \in R.$$


9.38. Example. When $n = 1$, $\det(A) = A(1,1)$. When $n = 2$, the possible permutations $w$
(in one-line form) are $w = 12$ with $\operatorname{sgn}(w) = +1$, and $w = 21$ with $\operatorname{sgn}(w) = -1$. Therefore,
the definition gives
$$\det(A) = \det\begin{bmatrix} A(1,1) & A(1,2)\\ A(2,1) & A(2,2) \end{bmatrix} = +A(1,1)A(2,2) - A(1,2)A(2,1).$$
When $n = 3$, the six permutations in $S_3$ give
$$\begin{aligned}
\det(A) &= \det\begin{bmatrix} A(1,1) & A(1,2) & A(1,3)\\ A(2,1) & A(2,2) & A(2,3)\\ A(3,1) & A(3,2) & A(3,3) \end{bmatrix}\\
&= +A(1,1)A(2,2)A(3,3) - A(1,1)A(2,3)A(3,2) - A(1,2)A(2,1)A(3,3)\\
&\quad + A(1,2)A(2,3)A(3,1) + A(1,3)A(2,1)A(3,2) - A(1,3)A(2,2)A(3,1).
\end{aligned}$$

In general, we see that $\det(A)$ is a sum of $n!$ signed terms. A given term arises by choosing
one factor $A(i, w(i))$ from each row of $A$; since $w$ is a permutation, each of the chosen
factors must come from a different column of $A$. The term in question is the product of
the $n$ chosen factors, times $\operatorname{sgn}(w)$. Since $\operatorname{sgn}(w) = (-1)^{\operatorname{inv}(w)}$, the sign attached to this
term depends on the parity of the number of basic transpositions needed to sort the column
indices $w(1), w(2), \ldots, w(n)$ into increasing order.
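
The defining formula can be evaluated directly for small matrices. The sketch below is illustrative only and is not taken from the text: it assumes integer entries stored as a list of rows, uses 0-based indices, and the name `det_by_definition` is made up for this example. The sum has n! terms, so this is a check of the definition rather than a practical algorithm.

```python
from itertools import permutations
from math import prod


def sign(w):
    """(-1)^inv(w) for a permutation w of 0..n-1 given as a tuple."""
    n = len(w)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])
    return -1 if inv % 2 else 1


def det_by_definition(A):
    """Sum over all w of sgn(w) * A[0][w(0)] * ... * A[n-1][w(n-1)]."""
    n = len(A)
    return sum(sign(w) * prod(A[i][w[i]] for i in range(n))
               for w in permutations(range(n)))


A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
print(det_by_definition(A))  # the six signed terms are 24 + 10 + 0 + 0 + 5 + 0
```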


The next result shows that we can replace $A(i, w(i))$ by $A(w(i), i)$ in the defining formula
for $\det(A)$. This corresponds to interchanging the roles of rows and columns in the
description above.


9.39. Definition: Transpose of a Matrix. Given $A \in M_n(R)$, the transpose of $A$ is the
matrix $A^t \in M_n(R)$ such that $A^t(i,j) = A(j,i)$ for all $i, j \le n$.


9.40. Theorem: Determinant of a Transpose. For all $A \in M_n(R)$, $\det(A^t) = \det(A)$.

Proof. By definition,
$$\det(A^t) = \sum_{w\in S_n} \operatorname{sgn}(w) \prod_{k=1}^{n} A^t(k, w(k)) = \sum_{w\in S_n} \operatorname{sgn}(w) \prod_{k=1}^{n} A(w(k), k).$$

For a fixed $w \in S_n$, we make a change of variables in the product indexed by $k$ by letting
$j = w(k)$, so $k = w^{-1}(j)$. Since $w$ is a permutation and $R$ is commutative, we have
$$\prod_{k=1}^{n} A(w(k), k) = \prod_{j=1}^{n} A(j, w^{-1}(j))$$
because the second product contains the same factors as the first product in a different
order (cf. 2.149). We now calculate
$$\det(A^t) = \sum_{w\in S_n} \operatorname{sgn}(w) \prod_{j=1}^{n} A(j, w^{-1}(j)) = \sum_{w\in S_n} \operatorname{sgn}(w^{-1}) \prod_{j=1}^{n} A(j, w^{-1}(j)).$$

Now consider the change of variable $v = w^{-1}$. As $w$ ranges over $S_n$, so does $v$, since $w \mapsto w^{-1}$
is a bijection on $S_n$. Furthermore, we can reorder the terms of the sum since addition in $R$
is commutative (see 2.149). We conclude that
$$\det(A^t) = \sum_{v\in S_n} \operatorname{sgn}(v) \prod_{j=1}^{n} A(j, v(j)) = \det(A). \qquad \square$$

Next we derive a formula for the determinant of an upper-triangular matrix.


9.41. Theorem: Determinant of Triangular and Diagonal Matrices. Suppose $A \in
M_n(R)$ satisfies $A(i,j) = 0$ whenever $i > j$. Then $\det(A) = \prod_{i=1}^{n} A(i,i)$. Consequently, if $A$
is either upper-triangular, lower-triangular, or diagonal, then $\det(A)$ is the product of the
diagonal entries of $A$.

Proof. By definition, $\det(A) = \sum_{w\in S_n} \operatorname{sgn}(w) \prod_{i=1}^{n} A(i, w(i))$. In order for a given summand
to be nonzero, we must have $i \le w(i)$ for all $i \le n$. Since $w$ is a permutation, we
successively deduce that $w(n) = n$, $w(n-1) = n-1$, ..., $w(1) = 1$. Thus, the only possibly
nonzero summand comes from $w = \mathrm{id}$. Since $\operatorname{sgn}(\mathrm{id}) = +1$ and $\mathrm{id}(i) = i$ for all $i$, the stated
formula for $\det(A)$ follows when $A$ is upper-triangular. The result for lower-triangular $A$
follows by considering $A^t$. Since diagonal matrices are upper-triangular, the proof is complete. □


9.42. Corollary: Determinant of Identity Matrix. For all $n \in \mathbb{N}^+$, $\det(I_n) = 1_R$.


9.6 Multilinearity and Laplace Expansions 
This section continues our development of the properties of determinants. 


9.43. Definition: R-Linear Maps. Let $R$ be a commutative ring and $n \in \mathbb{N}^+$. A map
$T : R^n \to R$ is called $R$-linear iff $T(v+z) = T(v) + T(z)$ and $T(cv) = cT(v)$ for all $v, z \in R^n$
and all $c \in R$.


9.44. Example. Suppose $b_1, \ldots, b_n \in R$ are fixed constants, and $T : R^n \to R$ is defined by
$$T(v_1, \ldots, v_n) = b_1v_1 + b_2v_2 + \cdots + b_nv_n.$$

It is routine to check that the map $T$ is $R$-linear. Conversely, one can show that every
$R$-linear map from $R^n$ to $R$ must be of this form.


9.45. Theorem: Multilinearity of Determinants. Let $A \in M_n(R)$, and let $k \le n$ be
a fixed row index. For every row vector $v \in R^n$, let $A[v]$ denote the matrix $A$ with row $k$
replaced by $v$. Then the map $T : R^n \to R$ given by $T(v) = \det(A[v])$ is $R$-linear. A similar
result holds for the columns of $A$.


Proof. By 9.44, it suffices to show that there exist constants $b_1, \ldots, b_n \in R$ such that for
all $v = (v_1, v_2, \ldots, v_n) \in R^n$,
$$T(v) = b_1v_1 + b_2v_2 + \cdots + b_nv_n. \tag{9.3}$$

To establish this, consider the defining formula for $\det(A[v])$:
$$T(v) = \det(A[v]) = \sum_{w\in S_n} \operatorname{sgn}(w) \prod_{i=1}^{n} A[v](i, w(i)) = \sum_{w\in S_n} \operatorname{sgn}(w) \Bigl[\prod_{i\ne k} A(i, w(i))\Bigr] v_{w(k)}.$$

The terms in brackets depend only on the fixed matrix $A$, not on $v$. So (9.3) holds with
$$b_j = \sum_{\substack{w\in S_n\\ w(k)=j}} \operatorname{sgn}(w) \prod_{i\ne k} A(i, w(i)) \qquad (1 \le j \le n). \tag{9.4}$$

To obtain the multilinearity result for the columns of $A$, apply the result just proved to
$A^t$. □


We sometimes use the following notation when invoking the multilinearity of determinants.
For $A \in M_n(R)$, let $A_1, A_2, \ldots, A_n$ denote the $n$ rows of $A$; thus each $A_i$ lies in $R^n$.
We write $\det(A) = \det(A_1, \ldots, A_n)$, viewing the determinant as a function of $n$ arguments
(row vectors). The previous result says that if we fix any $n - 1$ of these arguments and let
the other one vary, the resulting map $v \mapsto \det(A_1, \ldots, v, \ldots, A_n)$ (for $v \in R^n$) is $R$-linear.


9.46. Theorem: Alternating Property of Determinants. If $A \in M_n(R)$ has two
equal rows or two equal columns, then $\det(A) = 0$.


Proof. Recall $\det(A)$ is a sum of $n!$ signed terms of the form $T(w) = \operatorname{sgn}(w) \prod_{i=1}^{n} A(i, w(i))$,
where $w$ ranges over $S_n$. Suppose rows $r$ and $s$ of $A$ are equal, so $A(r,k) = A(s,k)$ for all
$k$. We will define an involution $I$ on $S_n$ with no fixed points such that $T(I(w)) = -T(w)$
for all $w \in S_n$. It will follow that the $n!$ terms cancel in pairs, so that $\det(A) = 0$. Define
$I(w) = w \circ (r,s)$ for $w \in S_n$; evidently $I \circ I = \mathrm{id}_{S_n}$, and $I$ has no fixed points. On one hand,
$\operatorname{sgn}(I(w)) = \operatorname{sgn}(w) \cdot \operatorname{sgn}((r,s)) = -\operatorname{sgn}(w)$ by 9.31. On the other hand,
$$\begin{aligned}
\prod_{i=1}^{n} A(i, [w\circ(r,s)](i)) &= A(r, w(s))\,A(s, w(r)) \prod_{i\ne r,s} A(i, w(i))\\
&= A(r, w(r))\,A(s, w(s)) \prod_{i\ne r,s} A(i, w(i)) = \prod_{i=1}^{n} A(i, w(i)).
\end{aligned}$$
Combining these facts, we see that $T(I(w)) = -T(w)$, as desired. If $A$ has two equal
columns, then $A^t$ has two equal rows, so $\det(A) = \det(A^t) = 0$. □


9.47. Theorem: Effect of Elementary Row Operations on Determinants. Let
$A \in M_n(R)$, let $j, k$ be distinct indices, and let $c \in R$.

(a) If $B$ is obtained from $A$ by multiplying row $j$ by $c$, then $\det(B) = c\det(A)$.

(b) If $B$ is obtained from $A$ by interchanging rows $j$ and $k$, then $\det(B) = -\det(A)$.

(c) If $B$ is obtained from $A$ by adding $c$ times row $j$ to row $k$, then $\det(B) = \det(A)$.
Analogous results hold for elementary column operations.


Proof. Part (a) is a special case of the multilinearity of determinants (see 9.45). Part (b)
is a consequence of multilinearity and the alternating property. Specifically, define $T :
R^n \times R^n \to R$ by letting $T(v, w) = \det(A_1, \ldots, v, \ldots, w, \ldots, A_n)$ (where the $v$ and $w$ occur
in positions $j$ and $k$). Since $\det$ is multilinear and alternating, we get
$$0 = T(v+w, v+w) = T(v,v) + T(w,v) + T(v,w) + T(w,w) = T(w,v) + T(v,w) \qquad (v, w \in R^n).$$

Thus, $T(w,v) = -T(v,w)$ for all $v, w$, which translates to statement (b) after taking $v = A_j$
and $w = A_k$. Part (c) follows for similar reasons, since
$$T(v, cv + w) = cT(v,v) + T(v,w) = T(v,w). \qquad \square$$
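
The three parts of 9.47 are easy to spot-check on a concrete matrix. The following Python sketch is an illustration under stated assumptions, not the book's method: entries are integers, rows are 0-indexed, the determinant is computed from the defining formula of 9.37, and the helper names (`scale_row`, `swap_rows`, `add_multiple`) are invented for this example.

```python
from itertools import permutations
from math import prod


def det(A):
    """Determinant from the defining sum over permutations (fine for small n)."""
    n = len(A)
    total = 0
    for w in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])
        total += (-1) ** inv * prod(A[i][w[i]] for i in range(n))
    return total


def scale_row(A, j, c):
    """Multiply row j by c."""
    return [[c * x for x in row] if r == j else list(row) for r, row in enumerate(A)]


def swap_rows(A, j, k):
    """Interchange rows j and k."""
    B = [list(row) for row in A]
    B[j], B[k] = B[k], B[j]
    return B


def add_multiple(A, j, k, c):
    """Add c times row j to row k."""
    B = [list(row) for row in A]
    B[k] = [x + c * y for x, y in zip(B[k], B[j])]
    return B


A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
d = det(A)
print(det(scale_row(A, 1, 7)) == 7 * d)       # part (a)
print(det(swap_rows(A, 0, 2)) == -d)          # part (b)
print(det(add_multiple(A, 0, 2, 5)) == d)     # part (c)
```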
9.48. Theorem: Laplace Expansions of Determinants. For $A \in M_n(R)$ and $i, j \le n$,
let $A[i|j]$ be the matrix in $M_{n-1}(R)$ obtained by deleting row $i$ and column $j$ of $A$. For
$1 \le k \le n$, we have
$$\begin{aligned}
\det(A) &= \sum_{i=1}^{n} (-1)^{i+k} A(i,k) \det(A[i|k]) && \text{(expansion along column $k$)}\\
&= \sum_{j=1}^{n} (-1)^{j+k} A(k,j) \det(A[k|j]) && \text{(expansion along row $k$).}
\end{aligned}$$

Proof. Let us first prove the Laplace expansion formula along row $k = n$. By the proof of
multilinearity (see equations (9.3) and (9.4)), we know that
$$\det(A) = b_1A(n,1) + b_2A(n,2) + \cdots + b_nA(n,n)$$
where
$$b_j = \sum_{\substack{w\in S_n\\ w(n)=j}} \operatorname{sgn}(w) \prod_{i=1}^{n-1} A(i, w(i)) \qquad (1 \le j \le n).$$
Comparing to the desired formula, we need only show that $b_j = (-1)^{j+n} \det(A[n|j])$ for all
$j$.

Fix an index $j$. Let $S_{n,j} = \{w \in S_n : w(n) = j\}$. We define a bijection $f : S_{n,j} \to S_{n-1}$
as follows. Every $w \in S_{n,j}$ can be written in one-line form as $w = w_1w_2\cdots w_{n-1}w_n$ where
$w_n = j$. Define $f(w) = w'_1w'_2\cdots w'_{n-1}$ where $w'_i = w_i$ if $w_i < j$, and $w'_i = w_i - 1$ if $w_i > j$.
In other words, we drop the $j$ at the end of $w$ and decrement all letters larger than $j$.
The inverse map increments all letters $\ge j$ and then adds a $j$ at the end. Observe that
the deletion of $j$ decreases $\operatorname{inv}(w)$ by $n - j$ (the number of letters to the left of $j$ that are
greater than $j$), and the decrementing operation has no further effect on the inversion count.
So, $\operatorname{inv}(f(w)) = \operatorname{inv}(w) - (n - j)$ and $\operatorname{sgn}(f(w)) = (-1)^{j+n} \operatorname{sgn}(w)$. We also note that for
$w' = f(w)$, we have $A(i, w(i)) = A[n|j](i, w'(i))$ for all $i < n$, since all columns in $A$ after
column $j$ get shifted one column left when column $j$ is deleted. Now use the bijection $f$ to
change the summation variable in the formula for $b_j$. Writing $w' = f(w)$, we obtain
$$\begin{aligned}
b_j &= \sum_{w\in S_{n,j}} \operatorname{sgn}(w) \prod_{i=1}^{n-1} A(i, w(i))\\
&= \sum_{w'\in S_{n-1}} (-1)^{j+n} \operatorname{sgn}(w') \prod_{i=1}^{n-1} A[n|j](i, w'(i)) = (-1)^{j+n} \det(A[n|j]).
\end{aligned}$$


The Laplace expansion along an arbitrary row $k$ follows from the special case $k = n$. Given
$k$, let $B$ be the matrix obtained from $A$ by successively interchanging row $k$ with row $k+1$,
$k+2$, ..., $n$. These $n - k$ row interchanges multiply the determinant by $(-1)^{n-k}$. It is
evident that $B(n,j) = A(k,j)$ and $B[n|j] = A[k|j]$ for all $j$. So
$$\begin{aligned}
\det(A) = (-1)^{n-k} \det(B) &= (-1)^{n-k} \sum_{j=1}^{n} (-1)^{j+n} B(n,j) \det(B[n|j])\\
&= \sum_{j=1}^{n} (-1)^{j+k} A(k,j) \det(A[k|j]).
\end{aligned}$$

Finally, to derive the Laplace expansion along column $k$, pass to transposes:
$$\begin{aligned}
\det(A) = \det(A^t) &= \sum_{j=1}^{n} (-1)^{j+k} A^t(k,j) \det(A^t[k|j])\\
&= \sum_{j=1}^{n} (-1)^{j+k} A(j,k) \det(A[j|k]). \qquad \square
\end{aligned}$$
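
The row expansion in 9.48 translates directly into a recursive procedure. The sketch below is a minimal illustration, assuming integer entries and 0-based indices (so the sign $(-1)^{j+k}$ is unchanged, since both indices shift by one); the names `minor` and `det_laplace` are hypothetical. The running time still grows like n!, so this is for checking small cases only.

```python
def minor(A, i, j):
    """The matrix A[i|j]: delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det_laplace(A, k=0):
    """Laplace expansion of det(A) along row k; minors are expanded along row 0."""
    n = len(A)
    if n == 0:
        return 1  # empty product convention for the 0 x 0 matrix
    if n == 1:
        return A[0][0]
    return sum((-1) ** (k + j) * A[k][j] * det_laplace(minor(A, k, j))
               for j in range(n))


A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
print([det_laplace(A, k) for k in range(3)])  # the same value for every row k
```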
We now use Laplace expansions to derive the classical formula for the inverse of a matrix. 


9.49. Definition: Classical Adjoint of a Matrix. Given $A \in M_n(R)$, let $\operatorname{adj} A \in M_n(R)$
be the matrix with $i,j$-entry $(-1)^{i+j} \det(A[j|i])$ for $i, j \le n$.

The next result explains why we wrote $A[j|i]$ instead of $A[i|j]$ in the preceding definition.


9.50. Theorem: Adjoint Formula. For all $A \in M_n(R)$, we have
$$A(\operatorname{adj} A) = (\det(A))I_n = (\operatorname{adj} A)A.$$


Proof. For $1 \le i \le n$, the $i,i$-entry of the product $A(\operatorname{adj} A)$ is
$$\sum_{k=1}^{n} A(i,k)[\operatorname{adj} A](k,i) = \sum_{k=1}^{n} (-1)^{i+k} A(i,k) \det(A[i|k]) = \det(A),$$
by Laplace expansion along row $i$ of $A$. Now suppose $i \ne j$. The $i,j$-entry of $A(\operatorname{adj} A)$ is
$$\sum_{k=1}^{n} A(i,k)[\operatorname{adj} A](k,j) = \sum_{k=1}^{n} (-1)^{j+k} A(i,k) \det(A[j|k]).$$

Let $C$ be the matrix obtained from $A$ by replacing row $j$ of $A$ by row $i$ of $A$. Then $C(j,k) =
A(i,k)$ and $C[j|k] = A[j|k]$ for all $k$. So the preceding expression is the Laplace expansion
for $\det(C)$ along row $j$. On the other hand, $\det(C) = 0$ because $C$ has two equal rows. So
$[A(\operatorname{adj} A)](i,j) = 0$. We have proved that $A(\operatorname{adj} A)$ is a diagonal matrix with all diagonal
entries equal to $\det(A)$, as desired. The analogous result for $(\operatorname{adj} A)A$ is proved similarly,
using column expansions. □


9.51. Corollary: Formula for the Inverse of a Matrix. If $A \in M_n(R)$ and $\det(A)$ is
an invertible element of $R$, then the matrix $A$ is invertible in $M_n(R)$ with inverse
$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj} A.$$

9.52. Remark. Conversely, if $A$ is invertible in $M_n(R)$, then $\det(A)$ is an invertible element
of $R$. The proof uses the following product formula for determinants:
$$\det(AB) = \det(A)\det(B) = \det(B)\det(A) \qquad (A, B \in M_n(R)).$$

Taking $B = A^{-1}$, the left side becomes $\det(I_n) = 1_R$, so $\det(B)$ is a two-sided inverse of
$\det(A)$ in $R$. We will deduce the product formula as a consequence of the Cauchy-Binet
formula, which is proved in the next section.




9.7 Cauchy-Binet Formula 


This section discusses the Cauchy-Binet formula, which expresses the determinant of a
product of rectangular matrices as the sum of a product of determinants of suitable submatrices.
The proof of this formula is a nice application of the properties of inversions and
determinants.

To state the Cauchy-Binet formula, we need the following notation. Given a $c \times d$ matrix
$M$, write $M_i$ for the $i$th row of $M$ and $M^j$ for the $j$th column of $M$. Given indices $j_1, \ldots, j_c \in
\{1,2,\ldots,d\}$, let $(M^{j_1}, \ldots, M^{j_c})$ denote the $c \times c$ matrix whose columns are $M^{j_1}, \ldots, M^{j_c}$
in this order. Similarly, $(M_{i_1}, \ldots, M_{i_d})$ is the matrix whose rows are $M_{i_1}, \ldots, M_{i_d}$ in this
order.

9.53. Theorem: Cauchy-Binet Formula. Suppose $m \le n$, $A$ is an $m \times n$ matrix,
and $B$ is an $n \times m$ matrix. Let $J$ be the set of all lists $j = (j_1, j_2, \ldots, j_m)$ such that
$1 \le j_1 < j_2 < \cdots < j_m \le n$. Then
$$\det(AB) = \sum_{j\in J} \det(A^{j_1}, A^{j_2}, \ldots, A^{j_m}) \det(B_{j_1}, B_{j_2}, \ldots, B_{j_m}).$$
Proof. Note that all matrices appearing in the formula are $m \times m$, so all the determinants
are defined. We begin by using the definitions of matrix products and determinants (§9.5)
to write
$$\det(AB) = \sum_{w\in S_m} \operatorname{sgn}(w) \prod_{i=1}^{m} (AB)(i, w(i)) = \sum_{w\in S_m} \operatorname{sgn}(w) \prod_{i=1}^{m} \sum_{k_i=1}^{n} A(i, k_i)B(k_i, w(i)).$$

The generalized distributive law (§2.1) changes the product of sums into a sum of products:
$$\det(AB) = \sum_{w\in S_m} \sum_{k_1=1}^{n} \cdots \sum_{k_m=1}^{n} \operatorname{sgn}(w) \prod_{i=1}^{m} A(i, k_i) \prod_{i=1}^{m} B(k_i, w(i)).$$

Let $K$ be the set of all lists $k = (k_1, \ldots, k_m)$ with every $k_i \in \{1,2,\ldots,n\}$, and let $K'$ be the
set of lists in $K$ whose entries $k_i$ are distinct. We can combine the $m$ separate sums over
the $k_i$'s into a single sum over lists $k \in K$. We can also reorder the summation to get
$$\det(AB) = \sum_{k\in K} \sum_{w\in S_m} \operatorname{sgn}(w) \prod_{i=1}^{m} A(i, k_i) \prod_{i=1}^{m} B(k_i, w(i)).$$

Next, factor out quantities that do not depend on $w$:
$$\det(AB) = \sum_{k\in K} \prod_{i=1}^{m} A(i, k_i) \Bigl[\sum_{w\in S_m} \operatorname{sgn}(w) \prod_{i=1}^{m} B(k_i, w(i))\Bigr].$$
The term in brackets is the defining formula for $\det(B_{k_1}, \ldots, B_{k_m})$. If any two entries in
$(k_1, \ldots, k_m)$ are equal, this matrix has two equal rows, so its determinant is zero. Discarding
these terms, we are reduced to summing over lists $k \in K'$. So now we have
$$\det(AB) = \sum_{k\in K'} \prod_{i=1}^{m} A(i, k_i) \det(B_{k_1}, \ldots, B_{k_m}).$$

To continue, observe that for every list $k \in K'$ there exists a unique sorted list $j \in J$ with
$j = \operatorname{sort}(k)$. Grouping summands gives
$$\det(AB) = \sum_{j\in J} \sum_{\substack{k\in K'\\ \operatorname{sort}(k)=j}} \prod_{i=1}^{m} A(i, k_i) \det(B_{k_1}, \ldots, B_{k_m}).$$
Given that $\operatorname{sort}(k) = j$, we can change the matrix $(B_{k_1}, \ldots, B_{k_m})$ into the matrix
$(B_{j_1}, \ldots, B_{j_m})$ by repeatedly switching adjacent rows. Each such switch flips the sign of the
determinant, and the number of row switches required is readily seen to be $\operatorname{inv}(k_1k_2\cdots k_m)$.
(To see this, adapt the proof of 9.29 to the case where the objects being sorted are
$\{j_1 < j_2 < \cdots < j_m\}$ instead of $\{1 < 2 < \cdots < m\}$; cf. 9.179.) Letting $\operatorname{sgn}(k) = (-1)^{\operatorname{inv}(k)}$,
we can therefore write
$$\det(AB) = \sum_{j\in J} \sum_{\substack{k\in K'\\ \operatorname{sort}(k)=j}} \operatorname{sgn}(k) \prod_{i=1}^{m} A(i, k_i) \det(B_{j_1}, \ldots, B_{j_m}).$$

The determinant in this formula depends only on $j$, not on $k$, so it can be brought out of
the inner summation:
$$\det(AB) = \sum_{j\in J} \det(B_{j_1}, \ldots, B_{j_m}) \sum_{\substack{k\in K'\\ \operatorname{sort}(k)=j}} \operatorname{sgn}(k) \prod_{i=1}^{m} A(i, k_i).$$

To finish, note that every $k \in K'$ that sorts to $j$ can be written as $(k_1, \ldots, k_m) =
(j_{v(1)}, \ldots, j_{v(m)})$ for a uniquely determined permutation $v \in S_m$. Since $j$ is an increasing
sequence, it follows that $\operatorname{inv}(k) = \operatorname{inv}(v)$ and $\operatorname{sgn}(k) = \operatorname{sgn}(v)$. Changing variables in
the inner summation, we get
$$\det(AB) = \sum_{j\in J} \det(B_{j_1}, \ldots, B_{j_m}) \Bigl[\sum_{v\in S_m} \operatorname{sgn}(v) \prod_{i=1}^{m} A(i, j_{v(i)})\Bigr].$$
The term in brackets is none other than $\det(A^{j_1}, \ldots, A^{j_m})$, so the proof is complete. □


9.54. Theorem: Product Formula for Determinants. If $A$ and $B$ are $m \times m$ matrices,
then $\det(AB) = \det(A)\det(B)$.


Proof. Take $n = m$ in the Cauchy-Binet formula. The index set $J$ consists of the single list
$(1, 2, \ldots, m)$, and the summand corresponding to this list reduces to $\det(A)\det(B)$. □
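
Both sides of the Cauchy-Binet formula can be computed independently for a small rectangular example. The following Python sketch is illustrative only: it assumes integer entries, uses 0-based column and row indices, and the names `det`, `matmul`, and `cauchy_binet` are chosen for this example. The increasing index lists in $J$ are generated with `itertools.combinations`.

```python
from itertools import combinations, permutations
from math import prod


def det(M):
    """Determinant from the defining sum over permutations (small matrices only)."""
    n = len(M)
    total = 0
    for w in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])
        total += (-1) ** inv * prod(M[i][w[i]] for i in range(n))
    return total


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def cauchy_binet(A, B):
    """Right-hand side of 9.53 for an m x n matrix A and an n x m matrix B."""
    m, n = len(A), len(B)
    total = 0
    for js in combinations(range(n), m):            # increasing lists j_1 < ... < j_m
        A_cols = [[A[i][j] for j in js] for i in range(m)]   # columns j of A
        B_rows = [B[j] for j in js]                          # rows j of B
        total += det(A_cols) * det(B_rows)
    return total


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [2, 1],
     [0, 3]]
print(det(matmul(A, B)), cauchy_binet(A, B))
```

When `A` and `B` are square of the same size, the loop has a single iteration and the function reduces to `det(A) * det(B)`, which is the product formula 9.54.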


Other examples of combinatorial proofs of determinant formulas appear in §11.14 and
§12.9.




9.8 Subgroups 


Suppose $(G, \star)$ is a group, and $H$ is a subset of $G$. One might hope that $(H, \star')$ is also a
group, where $\star'$ is the restriction of $\star$ to $H \times H$. This is not true in general, but it will be
true if $H$ is a subgroup.


9.55. Definition: Subgroups. Let $(G, \star)$ be a group and let $H$ be a subset of $G$. $H$ is
called a subgroup of $G$, written $H \le G$, iff the following three “closure conditions” are
satisfied:

$e_G \in H$ (closure under identity);
$\forall a, b \in H,\ a \star b \in H$ (closure under the operation);
$\forall a \in H,\ a^{-1} \in H$ (closure under inverses).

A subgroup $H$ is called normal in $G$, written $H \trianglelefteq G$, iff
$\forall g \in G,\ \forall h \in H,\ ghg^{-1} \in H$ (closure under conjugation).


Let us verify that $(H, \star')$ is indeed a group when $H \le G$. Since $H$ is closed under the
operation, $\star'$ does map $H \times H$ into $H$ (not just into $G$), so the closure axiom holds for
$(H, \star')$. Since $H$ is a subset of $G$, associativity holds in $H$ because it is known to hold in $G$.
The identity $e$ of $G$ lies in $H$ by assumption. Since $e \star h = h = h \star e$ holds for all $h \in G$,
the relation $e \star' h = h = h \star' e$ certainly holds for all $h \in H \subseteq G$. Finally, every element
$x$ of $H$ has an inverse $y$ (relative to $\star$) that lies in $H$, by assumption. Now $y$ is still an
inverse of $x$ relative to $\star'$, so the proof is complete. One also sees that $H$ is commutative if
$G$ is commutative, but the converse statement is not always true. Usually, we use the same
symbol $\star$ (instead of $\star'$) to denote the operation in the subgroup $H$.

9.56. Example. We have the following chain of subgroups of the additive group $(\mathbb{C}, +)$:
$$\{0\} \le \mathbb{Z} \le \mathbb{Q} \le \mathbb{R} \le \mathbb{C}.$$

Similarly, $\{-1, 1\}$ and $\mathbb{Q}^+$ are both subgroups of $(\mathbb{Q} \sim \{0\}, \times)$. The set $\{0, 3, 6, 9\}$ is a
subgroup of $(\mathbb{Z}_{12}, \oplus)$; one can prove closure under addition and inverses by a finite case
analysis, or by inspection of the relevant portion of the addition table for $\mathbb{Z}_{12}$.


9.57. Example. The sets $H = \{(1)(2)(3), (1,2,3), (1,3,2)\}$ and $K = \{(1)(2)(3), (1,3)\}$ are
subgroups of $S_3$, as one readily verifies. Moreover, $H$ is normal in $S_3$, but $K$ is not. The set
$J = \{(1)(2)(3), (1,3), (2,3), (1,3)\}$ is not a subgroup of $S_3$, since closure under the operation
fails: $(1,3) \circ (2,3) = (1,3,2) \notin J$. Here is a four-element normal subgroup of $S_4$:
$$V = \{(1)(2)(3)(4),\ (1,2)(3,4),\ (1,3)(2,4),\ (1,4)(2,3)\}.$$

Each element of $V$ is its own inverse, and one confirms closure of $V$ under the operation by
checking all possible products. To prove the normality of $V$ in $S_4$, it is helpful to use 9.131
below.


9.58. Example. The set of even integers is a subgroup of $(\mathbb{Z}, +)$. For, the identity element
zero is even; the sum of two even integers is again even; and $x$ even implies $-x$ is even. More
generally, let $k$ be any fixed integer, and let $H = \{kn : n \in \mathbb{Z}\}$ consist of all integer multiples
of $k$. A routine verification shows that $H$ is a subgroup of $(\mathbb{Z}, +)$. We write $H = k\mathbb{Z}$ for
brevity. The next theorem shows that we have found all the subgroups of the additive group
$\mathbb{Z}$.


9.59. Theorem: Subgroups of Z. Every subgroup $H$ of $(\mathbb{Z}, +)$ has the form $k\mathbb{Z}$ for a
unique integer $k \ge 0$.


Proof. We have noted that all the subsets $k\mathbb{Z}$ are indeed subgroups. Given an arbitrary
subgroup $H$, consider two cases. If $H = \{0\}$, then $H = 0\mathbb{Z}$. Otherwise, $H$ contains at least
one nonzero integer $m$. If $m$ is negative, then $-m \in H$ since $H$ is closed under inverses.
So, $H$ contains strictly positive integers. Take $k$ to be the least positive integer in $H$. We
claim that $H = k\mathbb{Z}$. Let us prove that $kn \in H$ for all $n \in \mathbb{Z}$, so that $k\mathbb{Z} \subseteq H$. For $n \ge 0$,
we argue by induction on $n$. When $n = 0$, we must prove $k0 = 0 \in H$, which holds since
$H$ contains the identity of $\mathbb{Z}$. When $n = 1$, we must prove $k1 = k \in H$, which is true by
choice of $k$. Assume $n \ge 1$ and $kn \in H$. Then $k(n+1) = kn + k \in H$ since $kn \in H$,
$k \in H$, and $H$ is closed under addition. Finally, for negative $n$, write $n = -m$ and note that
$kn = -(km) \in H$ since $km \in H$ and $H$ is closed under inverses.

The key step is to prove the reverse inclusion $H \subseteq k\mathbb{Z}$. Fix $z \in H$. Dividing $z$ by $k$,
we obtain $z = kq + r$ for some integers $q, r$ with $0 \le r < k$. By what we proved in the
last paragraph, $k(-q) \in H$. So, $r = z - kq = z + k(-q) \in H$ since $H$ is closed under
addition. Now, since $k$ is the least positive integer in $H$, we cannot have $0 < r < k$. The
only possibility left is $r = 0$, so $z = kq \in k\mathbb{Z}$, as desired.

Finally, to prove uniqueness, suppose $k\mathbb{Z} = m\mathbb{Z}$ for $k, m \ge 0$. Note $k = 0$ iff $m = 0$, so
assume $k, m > 0$. Since $k \in k\mathbb{Z} = m\mathbb{Z}$, $k$ is a multiple of $m$. Similarly, $m$ is a multiple of $k$.
As both $k$ and $m$ are positive, this forces $k = m$, completing the proof. □


How can we find subgroups of a given group G? As we see next, each element x € G 
gives rise to a subgroup of G in a natural way. 


9.60. Definition: Cyclic Subgroups and Cyclic Groups. Let $G$ be a group written
multiplicatively, and let $x \in G$. The cyclic subgroup of $G$ generated by $x$ is $\langle x \rangle = \{x^n : n \in
\mathbb{Z}\}$. One sees, using the laws of exponents, that this subset of $G$ really is a subgroup. $G$ is
called a cyclic group iff there exists $x \in G$ with $G = \langle x \rangle$. When $G$ is written additively, we
have $\langle x \rangle = \{nx : n \in \mathbb{Z}\}$.

9.61. Example. The group $(\mathbb{Z}, +)$ is cyclic, since $\mathbb{Z} = \langle 1 \rangle = \langle -1 \rangle$. The subgroups $k\mathbb{Z} = \langle k \rangle$
considered above are cyclic subgroups of $\mathbb{Z}$. Our last theorem implies that every subgroup of
$\mathbb{Z}$ is cyclic. The groups $(\mathbb{Z}_n, \oplus)$ are also cyclic; each of these groups is generated by 1. The
group $(\{a, b, c, d\}, \star)$ discussed at the end of 9.7 is not cyclic. To prove this, we compute all
the cyclic subgroups of this group:
$$\langle a \rangle = \{a\}, \quad \langle b \rangle = \{a, b\}, \quad \langle c \rangle = \{a, c\}, \quad \langle d \rangle = \{a, d\}.$$

None of the cyclic subgroups equals the whole group, so the group is not cyclic. For a bigger
example of a non-cyclic group, consider $(\mathbb{Q}, +)$. Any nonzero cyclic subgroup has the form
$\langle a/b \rangle$ for some positive rational number $a/b$. One may check that $a/(2b)$ does not lie in this
subgroup, so $\mathbb{Q} \ne \langle a/b \rangle$. Noncommutative groups furnish additional examples of non-cyclic
groups, as the next result shows.


9.62. Theorem: Cyclic Groups are Commutative.


Proof. Let $G = \langle x \rangle$ be cyclic. Given $y, z \in G$, we can write $y = x^n$ and $z = x^m$ for some
$n, m \in \mathbb{Z}$. Since integer addition is commutative, the laws of exponents give
$$yz = x^n x^m = x^{n+m} = x^{m+n} = x^m x^n = zy. \qquad \square$$


By adapting the argument in 9.59, one can show that every subgroup of a cyclic group 
is cyclic; we leave this as an exercise for the reader. 


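9.63. Example. The cyclic group $\mathbb{Z}_6$ has the following cyclic subgroups (which are all the
subgroups of this group):
$$\langle 0 \rangle = \{0\}; \quad \langle 1 \rangle = \{0,1,2,3,4,5\} = \langle 5 \rangle; \quad \langle 2 \rangle = \{0,2,4\} = \langle 4 \rangle; \quad \langle 3 \rangle = \{0,3\}.$$
In the group $S_4$, we have
$$\langle (1,3,4,2) \rangle = \{(1,3,4,2),\ (1,4)(3,2),\ (1,2,4,3),\ (1)(2)(3)(4)\}.$$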
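
Cyclic subgroups of a finite group can be listed by repeatedly applying the generator until the identity reappears. The sketch below is a small illustration with made-up function names; it assumes addition mod n for $\mathbb{Z}_n$, and permutations in one-line form as tuples of the values 1, ..., n, with `compose(f, g)` meaning "apply g first, then f".

```python
def cyclic_subgroup_mod(x, n):
    """The subgroup <x> of (Z_n, +): the multiples of x, reduced mod n."""
    elements, y = [], 0
    while True:
        elements.append(y)
        y = (y + x) % n
        if y == 0:          # back at the identity, so the list is complete
            break
    return sorted(elements)


def compose(f, g):
    """The composition f o g of two permutations in one-line form."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def cyclic_subgroup_perm(f):
    """The subgroup <f> of S_n, listed as identity, f, f^2, ..."""
    identity = tuple(range(1, len(f) + 1))
    elements, g = [identity], f
    while g != identity:
        elements.append(g)
        g = compose(f, g)
    return elements


print({x: cyclic_subgroup_mod(x, 6) for x in range(6)})
# The 4-cycle (1,3,4,2) sends 1->3, 3->4, 4->2, 2->1; in one-line form it is 3142.
print(cyclic_subgroup_perm((3, 1, 4, 2)))
```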




9.9 Automorphism Groups of Graphs 


This section uses graphs to construct examples of subgroups of symmetric groups. These 
subgroups will be used later when we discuss applications of group theory to counting 
problems. 


9.64. Definition: Automorphism Group of a Graph. Let $K$ be a simple graph with
vertex set $X$ and edge set $E$. A graph automorphism of $K$ is a bijection $f : X \to X$ such
that, for all $u \ne v$ in $X$, $\{u, v\} \in E$ iff $\{f(u), f(v)\} \in E$. Let $\operatorname{Aut}(K)$ denote the set of
all graph automorphisms of $K$. Analogous definitions are made for directed simple graphs;
here, the requirement on $f$ is that $(u, v) \in E$ iff $(f(u), f(v)) \in E$ for all $u, v \in X$.


One verifies immediately from the definition that $\operatorname{Aut}(K) \le (\operatorname{Sym}(X), \circ)$. Thus,
automorphism groups of graphs are subgroups of symmetric groups.


9.65. Example. Consider the graphs shown in Figure 9.2. The undirected cycle $C_5$ has
exactly ten automorphisms. They are given in one-line form in the following list:

[12345], [23451], [34512], [45123], [51234],
[54321], [43215], [32154], [21543], [15432].

FIGURE 9.2
Graphs used to illustrate automorphism groups.

The same automorphisms, written in cycle notation, look like this:

(1)(2)(3)(4)(5), (1,2,3,4,5), (1,3,5,2,4), (1,4,2,5,3), (1,5,4,3,2),
(1,5)(2,4)(3), (1,4)(2,3)(5), (1,3)(4,5)(2), (1,2)(3,5)(4), (2,5)(3,4)(1).


Geometrically, we can think of $C_5$ as a necklace with five beads. The first five automorphisms
on each list arise by rotating the necklace through various angles (rotation by zero is the
identity map). The next five automorphisms arise by reflecting the necklace in five possible
axes of symmetry.

Now consider the automorphism group of the directed cycle $D_5$. Every automorphism
of the directed graph $D_5$ is automatically an automorphism of the associated undirected
graph $C_5$, so $\operatorname{Aut}(D_5) \le \operatorname{Aut}(C_5)$. However, not every automorphism of $C_5$ is an automorphism
of $D_5$. In this example, the five “rotations” preserve the direction of the edges, hence
are automorphisms of $D_5$. But the five “reflections” reverse the direction of the edges, so
these are not elements of $\operatorname{Aut}(D_5)$. We can write $\operatorname{Aut}(D_5) = \langle (1,2,3,4,5) \rangle$, so that this
automorphism group is cyclic of size 5.

The 6-cycle $C_6$ can be analyzed in a similar way. The automorphism group consists of
six “rotations” and six “reflections,” which are given in cycle form below:

(1)(2)(3)(4)(5)(6), (1,2,3,4,5,6), (1,3,5)(2,4,6), (1,4)(2,5)(3,6),
(1,5,3)(2,6,4), (1,6,5,4,3,2), (2,6)(3,5)(1)(4), (1,2)(3,6)(4,5),
(1,3)(4,6)(2)(5), (2,3)(5,6)(1,4), (1,5)(2,4)(3)(6), (1,6)(2,5)(3,4).


The observations in the previous example generalize as follows. 


9.66. Theorem: Automorphism Group of a Cycle. For $n \ge 3$, let $C_n$ be the graph
with vertex set $X = \{1,2,\ldots,n\}$ and edge set $E = \{\{i, i+1\} : 1 \le i < n\} \cup \{\{1, n\}\}$. Then
$\operatorname{Aut}(C_n)$ is a subgroup of $S_n$ of size $2n$. The elements of this group (in one-line form) are
the $n$ permutations
$$[i,\ i+1,\ i+2,\ \ldots,\ n,\ 1,\ 2,\ \ldots,\ i-1] \qquad (1 \le i \le n) \tag{9.5}$$
together with the reversals of these $n$ words.


Proof. It is routine to check that all of the displayed permutations do preserve the edges
of $C_n$, hence are automorphisms of this graph. We must show that these are the only
automorphisms of $C_n$. Let $g$ be any automorphism of $C_n$, and put $i = g(1)$. Now, since 1
and 2 are adjacent in $C_n$, $g(1)$ and $g(2)$ must also be adjacent in $C_n$. There are two cases:
$g(2) = i + 1$ or $g(2) = i - 1$ (reading values mod $n$). Suppose the first case occurs. Since 2
is adjacent to 3, we must have $g(3) = i + 2$ or $g(3) = i$. But $i = g(1)$ and $g$ is injective, so it
must be that $g(3) = i + 2$. Continuing around the cycle in this way, we see that $g$ must be
one of the permutations displayed in (9.5). Similarly, in the case where $g(2) = i - 1$, we see
that $g(3) = i - 2$, etc., and $g$ must be the reversal of one of the permutations in (9.5). □

FIGURE 9.3
Graph representing a 5 × 5 chessboard.


The reasoning used in the preceding proof can be adapted to determine the automor- 
phism groups of more complicated graphs. 
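
For small graphs, an automorphism group can also be found by exhaustive search, which gives an independent check on arguments of this kind. The following Python sketch is an illustration, not the text's method: it tests every permutation of the vertex set, so it is only usable for a handful of vertices, and the function names are hypothetical. Because a bijection that maps the edge set onto itself automatically sends non-edges to non-edges, comparing the image of the edge set with the edge set suffices.

```python
from itertools import permutations


def cycle_edges(n):
    """Edge set of C_n on vertices 1..n: {i, i+1} for i < n, together with {1, n}."""
    return {frozenset((i, i % n + 1)) for i in range(1, n + 1)}


def automorphisms(n, edges):
    """All w in S_n (one-line form, as tuples) mapping the edge set onto itself."""
    result = []
    for w in permutations(range(1, n + 1)):
        image = {frozenset(w[u - 1] for u in e) for e in edges}
        if image == edges:
            result.append(w)
    return result


def rotations_and_reversals(n):
    """The 2n words described in Theorem 9.66."""
    words = []
    for i in range(1, n + 1):
        word = tuple(range(i, n + 1)) + tuple(range(1, i))
        words.extend([word, word[::-1]])
    return words


for n in (3, 4, 5, 6):
    found = automorphisms(n, cycle_edges(n))
    print(n, len(found), set(found) == set(rotations_and_reversals(n)))
```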


9.67. Example. Consider the graph $B$ displayed in Figure 9.3, which models a 5 × 5
chessboard. What are the automorphisms of $B$? We note that $B$ has four vertices of degree
2: $a$, $e$, $v$, and $z$. An automorphism $\phi$ of $B$ must restrict to give a permutation of these four
vertices, since automorphisms preserve degree. Suppose, for example, that $\phi(a) = v$. What
can $\phi(b)$ be in this situation? Evidently, $\phi(b)$ must be $q$ or $w$. In the former case, $\phi(c) = k$
is forced by degree considerations, whereas $\phi(c) = x$ is forced in the latter case. Continuing
around the “edge” of the graph, we see that the action of $\phi$ on all of the “border” vertices
is completely determined by where $a$ and $b$ go. A tedious but routine argument then shows
that the images of the remaining vertices are also forced. Since $a$ can map to one of the
four corners, and then $b$ can map to one of the two neighbors of $\phi(a)$, there are at most
4 × 2 = 8 automorphisms of $B$. Here are the eight possibilities in cycle form:


$r_0 = (a)(b)(c)\cdots(x)(y)(z) = \mathrm{id}$;
$r_1 = (a,e,z,v)(b,j,y,q)(c,p,x,k)(d,u,w,f)(g,i,t,r)(h,n,s,l)(m)$;
$r_2 = (a,z)(b,y)(c,x)(d,w)(e,v)(j,q)(p,k)(u,f)(g,t)(h,s)(i,r)(n,l)(m)$;
$r_3 = (a,v,z,e)(b,q,y,j)(c,k,x,p)(d,f,w,u)(g,r,t,i)(h,l,s,n)(m)$;
$s_{y=0} = (a,v)(b,w)(c,x)(d,y)(e,z)(f,q)(g,r)(h,s)(i,t)(j,u)(k)(l)(m)(n)(p)$;
$s_{x=0} = (a,e)(b,d)(f,j)(g,i)(k,p)(l,n)(q,u)(r,t)(v,z)(w,y)(c)(h)(m)(s)(x)$;
$s_{y=x} = (a,z)(b,u)(c,p)(d,j)(f,y)(g,t)(h,n)(k,x)(l,s)(q,w)(e)(i)(m)(r)(v)$;
$s_{y=-x} = (b,f)(c,k)(d,q)(e,v)(h,l)(i,r)(j,w)(n,s)(p,x)(u,y)(a)(g)(m)(t)(z)$.


One may check that all of these maps really are automorphisms of $B$, so $|\operatorname{Aut}(B)| = 8$. The
reader will perceive that this graph has essentially the same symmetries as $C_4$: four rotations
and four reflections. (The subscripts of the reflections indicate the axis of reflection, taking
$m$ to be located at the origin.) By directing the edges in a suitable way, we could produce
a graph with only four automorphisms (the rotations). These graphs and groups will play
a crucial role in solving the chessboard-coloring problem mentioned in the introduction to
this chapter.

FIGURE 9.4
The cube graph.


9.68. Example. As a final illustration of the calculation of an automorphism group, consider
the graph $C$ shown in Figure 9.4, which models a three-dimensional cube. We have
taken the vertex set of $C$ to be $\{0,1\}^3$, the set of binary words of length 3. Which bijections
$f : \{0,1\}^3 \to \{0,1\}^3$ might be automorphisms of $C$? First, $f(000)$ can be any of the eight
vertices. Next, the three neighbors of 000 (namely 001, 010, and 100) can be mapped bijectively
onto the three neighbors of $f(000)$ in any of $3! = 6$ ways. The images of the remaining
four vertices are now uniquely determined, as one may check. By the product rule, there are
at most 8 × 6 = 48 automorphisms of $C$. A routine but tedious verification shows that all of
these potential automorphisms really are automorphisms, so $|\operatorname{Aut}(C)| = 48$. The geometrically
inclined reader may like to visualize these automorphisms as arising from suitable
rotations and reflections in three-dimensional space. Here are the six automorphisms of $C$
that send 000 to 110:


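$$f_1 = \begin{pmatrix} 000 & 001 & 010 & 011 & 100 & 101 & 110 & 111\\ 110 & 100 & 111 & 101 & 010 & 000 & 011 & 001 \end{pmatrix},$$

$$f_2 = \begin{pmatrix} 000 & 001 & 010 & 011 & 100 & 101 & 110 & 111\\ 110 & 100 & 010 & 000 & 111 & 101 & 011 & 001 \end{pmatrix},$$

$$f_3 = \begin{pmatrix} 000 & 001 & 010 & 011 & 100 & 101 & 110 & 111\\ 110 & 111 & 100 & 101 & 010 & 011 & 000 & 001 \end{pmatrix},$$

$$f_4 = \begin{pmatrix} 000 & 001 & 010 & 011 & 100 & 101 & 110 & 111\\ 110 & 111 & 010 & 011 & 100 & 101 & 000 & 001 \end{pmatrix},$$

$$f_5 = \begin{pmatrix} 000 & 001 & 010 & 011 & 100 & 101 & 110 & 111\\ 110 & 010 & 100 & 000 & 111 & 011 & 101 & 001 \end{pmatrix},$$

$$f_6 = \begin{pmatrix} 000 & 001 & 010 & 011 & 100 & 101 & 110 & 111\\ 110 & 010 & 111 & 011 & 100 & 000 & 101 & 001 \end{pmatrix}.$$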

9.10 Group Homomorphisms 


9.69. Definition: Group Homomorphisms. Let $(G, \star)$ and $(H, \bullet)$ be groups. A function
$f : G \to H$ is called a group homomorphism iff
$$f(x \star y) = f(x) \bullet f(y) \quad \text{for all } x, y \in G.$$


A group isomorphism is a bijective group homomorphism. 




9.70. Example. Define $f : \mathbb{R} \to \mathbb{R}^+$ by $f(x) = e^x$. This function is a group homomorphism
from $(\mathbb{R}, +)$ to $(\mathbb{R}^+, \times)$, since $f(x+y) = e^{x+y} = e^x \times e^y = f(x) \times f(y)$ for all $x, y \in \mathbb{R}$. In
fact, $f$ is a group isomorphism since $g : \mathbb{R}^+ \to \mathbb{R}$ given by $g(x) = \ln x$ is a two-sided inverse
for $f$.


9.71. Example. Define $h : \mathbb{C} \to \mathbb{R}$ by $h(x + iy) = x$ for all $x, y \in \mathbb{R}$. One checks that $h$
is a group homomorphism from $(\mathbb{C}, +)$ to $(\mathbb{R}, +)$ that is surjective but not injective. Next,
define $r : \mathbb{C} \sim \{0\} \to \mathbb{R} \sim \{0\}$ by setting $r(x + iy) = |x + iy| = \sqrt{x^2 + y^2}$. Given nonzero
$w = x + iy$ and $z = u + iv$, we calculate
$$\begin{aligned}
r(wz) = r((xu - yv) + i(yu + xv)) &= \sqrt{(xu - yv)^2 + (yu + xv)^2}\\
&= \sqrt{(x^2 + y^2)(u^2 + v^2)} = r(w)r(z).
\end{aligned}$$

So $r$ is a homomorphism of multiplicative groups.


9.72. Example. For any group $G$, the identity map $\mathrm{id}_G : G \to G$ is a group isomorphism.
More generally, if $H \le G$, then the inclusion map $j : H \to G$ given by $j(h) = h$ for $h \in H$
is a group homomorphism. If $f : G \to K$ and $g : K \to P$ are group homomorphisms, then
$g \circ f : G \to P$ is a group homomorphism, since
$$(g \circ f)(xy) = g(f(xy)) = g(f(x)f(y)) = g(f(x))\,g(f(y)) = (g \circ f)(x)\,(g \circ f)(y) \qquad (x, y \in G).$$

Moreover, $g \circ f$ is an isomorphism if $f$ and $g$ are isomorphisms, since the composition of
bijections is a bijection. If $f : G \to K$ is an isomorphism, then $f^{-1} : K \to G$ is also an
isomorphism. For suppose $u, v \in K$. Write $x = f^{-1}(u)$ and $y = f^{-1}(v)$, so $u = f(x)$ and
$v = f(y)$. Since $f$ is a group homomorphism, it follows that $uv = f(xy)$. Applying $f^{-1}$ to
this relation, we get $f^{-1}(uv) = xy = f^{-1}(u)f^{-1}(v)$.


9.73. Definition: Automorphism Groups. Let $(G,\star)$ be a group. An automorphism of
$G$ is a group isomorphism $f : G \to G$. Let $\mathrm{Aut}(G)$ denote the set of all such automorphisms.


The remarks in the preceding example (with $K = P = G$) show that $\mathrm{Aut}(G)$ is a
subgroup of $(\mathrm{Sym}(G),\circ)$.


9.74. Example: Inner Automorphisms. Let $G$ be any group, and fix an element $g \in G$.
Define a map $C_g : G \to G$ (called conjugation by $g$) by setting $C_g(x) = gxg^{-1}$. This
map is a group homomorphism, since $C_g(xy) = g(xy)g^{-1} = (gxg^{-1})(gyg^{-1}) = C_g(x)C_g(y)$
for all $x,y \in G$. Furthermore, $C_g$ is a group isomorphism, since a calculation shows that
$C_{g^{-1}}$ is the two-sided inverse of $C_g$. It follows that $C_g \in \mathrm{Aut}(G)$ for every $g \in G$. We call
automorphisms of the form $C_g$ inner automorphisms of $G$. It is possible for different group
elements to induce the same inner automorphism of $G$. For example, if $G$ is commutative,
then $C_g(x) = gxg^{-1} = gg^{-1}x = x$ for all $g,x \in G$, so that all of the conjugation maps $C_g$
reduce to $\mathrm{id}_G$.


9.75. Theorem: Properties of Group Homomorphisms. Let $f : G \to H$ be a group
homomorphism. For all $n \in \mathbb{Z}$ and all $x \in G$, $f(x^n) = f(x)^n$. In particular, $f(e_G) = e_H$ and
$f(x^{-1}) = f(x)^{-1}$. We say that $f$ preserves powers, identities, and inverses.


Proof. First we prove the result for all $n \geq 0$ by induction on $n$. When $n = 0$, we must
prove that $f(e_G) = e_H$. Note that $e_Ge_G = e_G$. Applying $f$ to both sides of this equation
gives

$$f(e_G)f(e_G) = f(e_Ge_G) = f(e_G) = f(e_G)e_H.$$

By left cancellation of $f(e_G)$ in $H$, we conclude that $f(e_G) = e_H$. For the induction step,
assume $n \geq 0$ and $f(x^n) = f(x)^n$; we will prove $f(x^{n+1}) = f(x)^{n+1}$. Using the definition of
exponent notation, we calculate

$$f(x^{n+1}) = f(x^n x) = f(x^n)f(x) = f(x)^n f(x) = f(x)^{n+1}.$$

Next, let us prove the result when $n = -1$. Given $x \in G$, apply $f$ to the equation $xx^{-1} = e_G$
to obtain

$$f(x)f(x^{-1}) = f(xx^{-1}) = f(e_G) = e_H = f(x)f(x)^{-1}.$$

Left cancellation of $f(x)$ gives $f(x^{-1}) = f(x)^{-1}$. Finally, consider an arbitrary negative
integer $n = -m$, where $m > 0$. We have

$$f(x^n) = f((x^m)^{-1}) = f(x^m)^{-1} = (f(x)^m)^{-1} = f(x)^{-m} = f(x)^n. \qquad \square$$

We can use group homomorphisms to construct more examples of subgroups.


9.76. Definition: Kernel and Image of a Homomorphism. Let $f : G \to H$ be a
group homomorphism. The kernel of $f$, denoted $\ker(f)$, is the set of all $x \in G$ such that
$f(x) = e_H$. The image of $f$, denoted $\mathrm{img}(f)$, is the set of all $y \in H$ such that $y = f(x)$ for
some $x \in G$.


The reader may check that $\ker(f) \leq G$ and $\mathrm{img}(f) \leq H$.


9.77. Example. Consider the homomorphisms $h$ and $r$ from 9.71, given by $h(x + iy) = x$
and $r(z) = |z|$ for $x,y \in \mathbb{R}$ and nonzero $z \in \mathbb{C}$. The kernel of $h$ is the set of pure imaginary
numbers $\{iy : y \in \mathbb{R}\}$. The kernel of $r$ is the unit circle $\{z \in \mathbb{C} : |z| = 1\}$. The image of $h$ is
all of $\mathbb{R}$, while the image of $r$ is $\mathbb{R}^+$.


9.78. Example: Even Permutations. By 9.31, the function $\mathrm{sgn} : S_n \to \{+1,-1\}$ is a
group homomorphism. The kernel of this homomorphism, which is denoted $A_n$, consists of
all $f \in S_n$ such that $\mathrm{sgn}(f) = +1$. Such $f$ are called even permutations. $A_n$ is called the
alternating group on $n$ letters. We will see later (9.121) that $|A_n| = |S_n|/2 = n!/2$ for all
$n \geq 2$.
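
For small $n$, both the homomorphism property of sgn and the size of its kernel can be checked by enumeration. The sketch below assumes the sign is computed as $(-1)^{\mathrm{inv}}$, where inv is the number of inversions; this is one standard formula and may differ in form from the definition behind 9.31. Permutations are encoded as 0-based tuples of images, and the helper names are illustrative.

```python
from itertools import permutations
from math import factorial

def sgn(p):
    # p[i] is the image of i (0-based); the sign is (-1)^(number of inversions).
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

n = 4
S_n = list(permutations(range(n)))

# sgn(p o q) = sgn(p) * sgn(q) for all p, q in S_n
is_homomorphism = all(sgn(compose(p, q)) == sgn(p) * sgn(q) for p in S_n for q in S_n)

# The kernel of sgn is the alternating group A_n.
A_n = [p for p in S_n if sgn(p) == 1]
print(is_homomorphism, len(A_n), factorial(n) // 2)
```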


9.79. Example: Analysis of Cyclic Subgroups. Let $G$ be any group, written multiplicatively,
and fix an element $x \in G$. Define $f : \mathbb{Z} \to G$ by setting $f(n) = x^n$ for all $n \in \mathbb{Z}$.
By the laws of exponents, $f$ is a group homomorphism. The image of $f$ is precisely $\langle x \rangle$, the
cyclic subgroup of $G$ generated by $x$. The kernel of $f$ is some subgroup of $\mathbb{Z}$, which by 9.59
has the form $m\mathbb{Z}$ for some integer $m \geq 0$. Consider the case where $m = 0$. Then $x^i = e_G$ iff
$f(i) = e_G$ iff $i \in \ker(f)$ iff $i = 0$, so $x^0$ is the only power of $x$ that equals the identity of $G$.
We say that $x$ has infinite order in this case. We remark that $i \neq j$ implies $x^i \neq x^j$, since

$$x^i = x^j \Rightarrow x^{i-j} = e_G \Rightarrow i - j = 0 \Rightarrow i = j.$$

In other words, all integer powers of $x$ are distinct elements of $G$. This means that $f : \mathbb{Z} \to G$
is injective. So $f$ induces a group isomorphism $f' : \mathbb{Z} \cong \langle x \rangle$.

Now consider the case where $m > 0$. Then $x^i = e_G$ iff $f(i) = e_G$ iff $i \in \ker(f)$ iff $i$ is a
multiple of $m$. We say that $x$ has order $m$ in this case; thus, the order of $x$ is the least positive
exponent $i$ such that $x^i = e_G$. We claim that the cyclic group $\langle x \rangle$ consists of the $m$ distinct
elements $x^0, x^1, x^2, \ldots, x^{m-1}$. For, given an arbitrary element $x^n \in \langle x \rangle$, we can divide $n$ by
$m$ to get $n = mq + r$ for some $r$ with $0 \leq r < m$. Then $x^n = x^{mq+r} = (x^m)^q x^r = e_G^q x^r = x^r$,
so $x^n$ is equal to one of the elements in our list. Furthermore, the listed elements are distinct.
For suppose $0 \leq i \leq j < m$ and $x^i = x^j$. Then $x^{j-i} = e_G$, forcing $m$ to divide $j - i$. But
$0 \leq j - i < m$, so the only possibility is $j - i = 0$, hence $i = j$. Consider the function
$g : \mathbb{Z}_m \to \langle x \rangle$ given by $g(i) = x^i$ for $0 \leq i < m$. This function is a well-defined bijection
by the preceding remarks. Furthermore, $g$ is a group homomorphism. To check this, let
$i,j \in \mathbb{Z}_m$. If $i + j < m$, then

$$g(i \oplus j) = g(i + j) = x^{i+j} = x^i x^j = g(i)g(j).$$

If $i + j \geq m$, then

$$g(i \oplus j) = g(i + j - m) = x^{i+j-m} = x^i x^j (x^m)^{-1} = x^i x^j = g(i)g(j).$$

So $g$ is an isomorphism from $(\mathbb{Z}_m,\oplus)$ to the cyclic subgroup generated by $x$.

We have just shown that every cyclic group $\langle x \rangle$ is isomorphic to one of the additive
groups $\mathbb{Z}$ or $\mathbb{Z}_m$ for some $m > 0$. The first case occurs when $x$ has infinite order, and the
second case occurs when $x$ has order $m$.
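
The finite-order case can be illustrated concretely. The sketch below takes $x = (1,3)(2,4,5)$ in $S_5$ (an arbitrary choice of element, encoded as a 0-based tuple of images), lists the powers $x^0, x^1, \ldots, x^{m-1}$, and checks that $i \mapsto x^i$ turns addition modulo $m$ into composition. The function names are illustrative.

```python
def compose(p, q):
    # (p o q)(i) = p(q(i)) for permutations stored as 0-based tuples of images
    return tuple(p[q[i]] for i in range(len(q)))

def cyclic_subgroup(x):
    # Returns [x^0, x^1, ..., x^(m-1)], where m is the order of x.
    identity = tuple(range(len(x)))
    powers = [identity]
    current = x
    while current != identity:
        powers.append(current)
        current = compose(x, current)
    return powers

# x = (1,3)(2,4,5) in S_5, written 0-based: 0 <-> 2 and 1 -> 3 -> 4 -> 1
x = (2, 3, 0, 4, 1)
powers = cyclic_subgroup(x)
m = len(powers)  # the order of x

# g(i) = x^i satisfies g(i (+) j) = g(i) g(j), where (+) is addition mod m
g_is_homomorphism = all(
    powers[(i + j) % m] == compose(powers[i], powers[j])
    for i in range(m) for j in range(m)
)
print(m, g_is_homomorphism)
```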




9.11 Group Actions 


The fundamental tool needed to solve counting problems involving symmetry is the notion 
of a group action. 


9.80. Definition: Group Actions. Suppose $G$ is a group and $X$ is a set. An action of
$G$ on $X$ is a function $* : G \times X \to X$ satisfying the following axioms.


1. For all $g \in G$ and all $x \in X$, $g * x \in X$ (closure).
2. For all $x \in X$, $e_G * x = x$ (identity).
3. For all $g,h \in G$ and all $x \in X$, $g * (h * x) = (gh) * x$ (associativity).


The pair $(X,*)$ is called a $G$-set.


9.81. Example. For any set $X$, the group $G = (\mathrm{Sym}(X),\circ)$ acts on $X$ via the rule $g * x =
g(x)$ for $g \in G$ and $x \in X$. Axiom 1 holds because each $g \in G$ is a function from $X$ to $X$,
hence $g * x = g(x) \in X$ for all $x \in X$. Axiom 2 holds since $e_G * x = \mathrm{id}_X * x = \mathrm{id}_X(x) = x$
for all $x \in X$. Axiom 3 holds because

$$g * (h * x) = g(h(x)) = (g \circ h)(x) = (gh) * x \qquad (g,h \in \mathrm{Sym}(X),\ x \in X).$$


9.82. Example. Let $G$ be any group, written multiplicatively, and let $X$ be the set $G$.
Define $* : G \times X \to X$ by $g * x = gx$ for all $g,x \in G$. We say that “$G$ acts on itself by
left multiplication.” In this example, the action axioms reduce to the corresponding group
axioms for $G$.

We can define another action of $G$ on $X = G$ by letting $g \bullet x = xg^{-1}$ for all $g,x \in G$.
The first two axioms for an action are immediately verified; the third axiom follows from
the calculation

$$g \bullet (h \bullet x) = g \bullet (xh^{-1}) = (xh^{-1})g^{-1} = x(h^{-1}g^{-1}) = x(gh)^{-1} = (gh) \bullet x \qquad (g,h,x \in G).$$

We say that “$G$ acts on itself by inverted right multiplication.” One can check that the rule
$g \cdot x = xg$ (for $g,x \in G$) does not define a group action for non-commutative groups $G$,
because axiom 3 fails. (But see the discussion of right group actions below.)
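
The three axioms of 9.80 are easy to test mechanically on a small group. The following sketch takes $G = S_3$ (encoded as 0-based tuples of images) acting on $X = G$, and checks the three rules discussed in this example: left multiplication, inverted right multiplication, and right multiplication without the inverse. The checker `is_action` and the other helper names are illustrative, not notation from the text.

```python
from itertools import permutations, product

def compose(p, q):
    # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_action(G, X, e, act):
    # Checks closure, identity, and associativity for act : G x X -> X.
    X = set(X)
    closure = all(act(g, x) in X for g, x in product(G, X))
    identity = all(act(e, x) == x for x in X)
    associativity = all(
        act(g, act(h, x)) == act(compose(g, h), x)
        for g, h, x in product(G, G, X)
    )
    return closure and identity and associativity

S3 = list(permutations(range(3)))
e = (0, 1, 2)

left_mult = is_action(S3, S3, e, lambda g, x: compose(g, x))
inverted_right_mult = is_action(S3, S3, e, lambda g, x: compose(x, inverse(g)))
plain_right_mult = is_action(S3, S3, e, lambda g, x: compose(x, g))
print(left_mult, inverted_right_mult, plain_right_mult)
```

Only the third rule fails, and it fails at the associativity axiom, as the example predicts for a non-commutative group.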


9.83. Example. Let the group $G$ act on the set $X = G$ as follows:

$$g * x = gxg^{-1} \qquad (g \in G,\ x \in X).$$

We say that “$G$ acts on itself by conjugation.” The reader should verify that the axioms for
an action are satisfied.


9.84. Example. Suppose we are given a group action $* : G \times X \to X$. Let $H$ be any
subgroup of $G$. By restricting the action function to $H \times X$, we obtain an action of $H$ on
$X$, as one immediately verifies. Combining this construction with previous examples, we
obtain quite a few additional instances of group actions. For example, any subgroup $H$ of
a group $G$ acts on $G$ by left multiplication, and by inverted right multiplication, and by
conjugation. Any subgroup $H$ of $\mathrm{Sym}(X)$ acts on $X$ via $f * x = f(x)$ for $f \in H$ and $x \in X$.
In particular, the automorphism group $\mathrm{Aut}(G)$ of a group $G$ is a subgroup of $\mathrm{Sym}(G)$, so
$\mathrm{Aut}(G)$ acts on $G$ via $f * x = f(x)$ for $f \in \mathrm{Aut}(G)$ and $x \in G$. Similarly, if $K$ is a graph
with vertex set $X$, then $\mathrm{Aut}(K)$ is a subgroup of $\mathrm{Sym}(X)$, and therefore $\mathrm{Aut}(K)$ acts on
$X$ via $f * x = f(x)$ for $f \in \mathrm{Aut}(K)$ and $x \in X$.


9.85. Example. Suppose $(X,*)$ is a $G$-set. Let $P(X)$ be the power set of $X$, which consists
of all subsets of $X$. It is routine to check that $P(X)$ is a $G$-set under the action

$$g \bullet S = \{g * s : s \in S\} \qquad (g \in G,\ S \in P(X)).$$


9.86. Example. Consider a polynomial ring $R = F[x_1,x_2,\ldots,x_n]$, where $F$ is a field
(see §7.16). The symmetric group $S_n$ acts on $\{1,2,\ldots,n\}$ via $f * i = f(i)$ for $f \in S_n$ and
$1 \leq i \leq n$. We can transfer this to an action of $S_n$ on $\{x_1,\ldots,x_n\}$ by defining

$$f * x_i = x_{f(i)} \qquad (f \in S_n,\ 1 \leq i \leq n).$$

Using the universal mapping property of polynomial rings (see 7.102), each bijection
$(x_i \mapsto x_{f(i)} : 1 \leq i \leq n)$ extends to a ring isomorphism $E_f$ sending $p = p(x_1,\ldots,x_n) \in R$ to
$E_f(p) = p(x_{f(1)},\ldots,x_{f(n)})$. One may check that the rule $f * p = E_f(p)$ (for $f \in S_n$ and $p \in R$)
defines an action of $S_n$ on $R$. In particular, $g * (h * p) = (g \circ h) * p$ follows by the uniqueness
part of the universal mapping property, since both sides are the image of $p$ under the unique
ring homomorphism sending $x_i$ to $x_{g(h(i))}$ for all $i$.


9.87. Example. By imitating ideas in the previous example, we can define certain group
actions on vector spaces. Suppose $V$ is a vector space over a field $F$ and let $X = (x_1,\ldots,x_n)$
be an ordered basis of $V$. For $f \in S_n$, the map $x_i \mapsto x_{f(i)}$ on basis vectors extends by
linearity to a unique linear map $T_f : V \to V$, given explicitly by

$$T_f(a_1x_1 + \cdots + a_nx_n) = a_1x_{f(1)} + \cdots + a_nx_{f(n)} \qquad (a_i \in F).$$

One may check that $f * v = T_f(v)$ (for $f \in S_n$ and $v \in V$) defines an action of the group
$S_n$ on the set $V$.


9.88. Example. Suppose $G$ is a group, $(X,*)$ is a $G$-set, and $W$ and $Z$ are any sets. Recall
that ${}^{W}X$ is the set of all functions $F : W \to X$. This set of functions can be turned into a
$G$-set by defining

$$(g \bullet F)(w) = g * (F(w)) \qquad (g \in G,\ F \in {}^{W}X,\ w \in W).$$

We leave the verification of the action axioms as an exercise.

Now consider the set ${}^{X}Z$ of all functions $F : X \to Z$. We claim this set of functions
becomes a $G$-set if we define

$$(g \bullet F)(x) = F(g^{-1} * x) \qquad (g \in G,\ F \in {}^{X}Z,\ x \in X).$$

Let us carefully prove this claim. First, given $g \in G$ and $F \in {}^{X}Z$, the map $g \bullet F$ is a well-defined
function from $X$ to $Z$ because $g^{-1} * x \in X$ and $F$ maps $X$ into $Z$. So, $g \bullet F \in {}^{X}Z$,
verifying closure. Second, letting $e$ be the identity of $G$ and letting $F \in {}^{X}Z$, we have

$$(e \bullet F)(x) = F(e^{-1} * x) = F(e * x) = F(x) \qquad (x \in X),$$

so that we have the equality of functions $e \bullet F = F$. (Recall that two functions are equal iff
they have the same domain $X$, have the same codomain, and take the same values at each
$x \in X$.) Third, we verify the associativity axiom for $\bullet$. Fix $g,h \in G$ and $F \in {}^{X}Z$. The two
functions $g \bullet (h \bullet F)$ and $(gh) \bullet F$ both have domain $X$ and codomain $Z$. Fix $x \in X$. On
one hand,

$$[(gh) \bullet F](x) = F((gh)^{-1} * x) = F((h^{-1}g^{-1}) * x).$$

On the other hand, using the definition of $\bullet$ twice,

$$[g \bullet (h \bullet F)](x) = [h \bullet F](g^{-1} * x) = F(h^{-1} * (g^{-1} * x)).$$

Since $*$ is known to be an action, we see that $g \bullet (h \bullet F)$ and $(gh) \bullet F$ take the same value
at $x$. So the third action axiom is proved. The reader may check that this axiom would fail,
in general, if we omitted the inverse in the definition of $\bullet$.
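
The role of the inverse can be seen in a small computation. The sketch below assumes $X = \{0,1,2\}$ with $G = S_3$ acting naturally, and $Z = \{a,b\}$; a function $F : X \to Z$ is encoded as the tuple $(F(0),F(1),F(2))$. These encodings and the helper names are illustrative choices.

```python
from itertools import permutations, product

def compose(p, q):
    # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

X = range(3)
G = list(permutations(X))                    # S_3 acting on X by g * x = g[x]
functions = list(product("ab", repeat=3))    # all F : X -> Z with Z = {a, b}

def act_with_inverse(g, F):
    # (g . F)(x) = F(g^{-1} * x)
    g_inv = inverse(g)
    return tuple(F[g_inv[x]] for x in X)

def act_without_inverse(g, F):
    # The tempting but incorrect rule (g . F)(x) = F(g * x)
    return tuple(F[g[x]] for x in X)

def satisfies_axiom_3(act):
    return all(
        act(g, act(h, F)) == act(compose(g, h), F)
        for g in G for h in G for F in functions
    )

print(satisfies_axiom_3(act_with_inverse))     # the rule from the text
print(satisfies_axiom_3(act_without_inverse))  # the rule with the inverse omitted
```

Without the inverse, the rule composes in the wrong order: it satisfies the right-action identity rather than axiom 3.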


9.89. Example. Let $n$ be a fixed integer, let $Y$ be a set, and let

$$U = \{(y_1,\ldots,y_n) : y_i \in Y\}$$

be the set of all sequences of $n$ elements of $Y$. The group $S_n$ acts on $U$ via the rule

$$f \bullet (y_1,y_2,\ldots,y_n) = (y_{f^{-1}(1)}, y_{f^{-1}(2)}, \ldots, y_{f^{-1}(n)}) \qquad (f \in S_n;\ y_1,\ldots,y_n \in Y).$$

The inverses in this formula are essential. To see why, we observe that the action here is
actually a special case of the previous example. For, a sequence in $U$ is officially defined
to be a function $y : \{1,2,\ldots,n\} \to Y$ where $y(i) = y_i$. Using this function notation for
sequences, we have (for $f \in S_n$)

$$(f \bullet y)(i) = y(f^{-1}(i)) = y_{f^{-1}(i)} \qquad (1 \leq i \leq n),$$

in agreement with the previous example. One should also note that acting by $f$ moves the
object $z$ originally in position $i$ to position $f(i)$ in the new sequence. This is true because
$(f \bullet y)(f(i)) = y(f^{-1}(f(i))) = y(i) = z$.

The reader may now be disturbed by the lack of inverses in the formula $f * x_i = x_{f(i)}$
from 9.87. However, there is no contradiction since the $x_i$’s in the latter example are fixed
basis elements in a vector space $V$, not the entries in a sequence. Indeed, recall that the
action on $V$ is given by $f * v = T_f(v)$ where $T_f$ is the linear extension of the map $x_i \mapsto x_{f(i)}$.
Writing $v = \sum_i a_i x_i$, the coordinates of $v$ relative to this basis are the entries in the sequence
$(a_1,a_2,\ldots,a_n)$. Applying $f$ to $v$ gives

$$f * v = \sum_i a_i x_{f(i)} = \sum_j a_{f^{-1}(j)} x_j,$$

where we changed variables by letting $j = f(i)$, $i = f^{-1}(j)$. We now see that the coordinates
of $f * v$ relative to the ordered basis $(x_1,\ldots,x_n)$ are $(a_{f^{-1}(1)},\ldots,a_{f^{-1}(n)})$. For example,

$$(1,2,3) * (a_1x_1 + a_2x_2 + a_3x_3) = (a_1x_2 + a_2x_3 + a_3x_1) = (a_3x_1 + a_1x_2 + a_2x_3),$$

or equivalently, in coordinate notation,

$$(1,2,3) * (a_1,a_2,a_3) = (a_3,a_1,a_2).$$

To summarize, when $f$ acts directly on the objects $x_i$, no inverse is needed; but when $f$
permutes the positions in a list, one must apply $f^{-1}$ to each subscript.


9.90. Remark: Right Actions. A right action of a group $G$ on a set $X$ is a map $* :
X \times G \to X$ such that $x * e = x$ and $x * (gh) = (x * g) * h$ for all $x \in X$ and all $g,h \in G$. For
example, $x * g = xg$ (with no inverse) defines a right action of a group $G$ on the set $X = G$.
Similarly, we get a right action of $S_n$ on the set of sequences in the previous example by
writing

$$(y_1,\ldots,y_n) * f = (y_{f(1)},\ldots,y_{f(n)}).$$

Group actions (as defined at the beginning of this section) are sometimes called left actions
to avoid confusion with right actions. We shall mostly consider left group actions in the
sequel, but right actions are occasionally more convenient to use (cf. 9.109).
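
The two ways of letting a permutation act on a sequence can be compared directly. In this sketch a permutation of $\{1,\ldots,n\}$ is stored as a dictionary and a sequence as a tuple; `left_act` applies $f^{-1}$ to subscripts (equivalently, moves the entry in position $i$ to position $f(i)$), while `right_act` uses no inverse. Both function names are illustrative.

```python
def compose(f, g):
    # (f o g)(i) = f(g(i)) for permutations stored as dicts on {1, ..., n}
    return {i: f[g[i]] for i in g}

def left_act(f, y):
    # Left action: entry i of the result is y_{f^{-1}(i)}; equivalently,
    # the object in position i of y moves to position f(i).
    result = [None] * len(y)
    for i in range(1, len(y) + 1):
        result[f[i] - 1] = y[i - 1]
    return tuple(result)

def right_act(y, f):
    # Right action: entry i of the result is y_{f(i)} (no inverse).
    return tuple(y[f[i] - 1] for i in range(1, len(y) + 1))

f = {1: 2, 2: 3, 3: 1}      # the 3-cycle (1,2,3)
g = {1: 2, 2: 1, 3: 3}      # the transposition (1,2)
y = ("a1", "a2", "a3")

print(left_act(f, y))       # compare with (1,2,3) * (a1,a2,a3) in the text
# Left action axiom:  g . (f . y) = (g o f) . y
print(left_act(g, left_act(f, y)) == left_act(compose(g, f), y))
# Right action axiom: (y * g) * f = y * (g o f)
print(right_act(right_act(y, g), f) == right_act(y, compose(g, f)))
```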




9.12 Permutation Representations 


Group actions are closely related to symmetric groups. To understand the precise nature of 
this relationship, we need the following definition. 


9.91. Definition: Permutation Representations. A permutation representation of a
group $G$ on a set $X$ is a group homomorphism $\phi : G \to \mathrm{Sym}(X)$.


This definition seems quite different from the definition of a group action given in the last 
section. But we will see in this section that group actions and permutation representations 
are essentially the same thing. Both viewpoints turn out to be pertinent in the application 
of group actions to problems in combinatorics and algebra. 

We first show that any group action of G on X gives rise to a permutation representation 
of G on X in a canonical way. The key idea appears in the next definition. 


9.92. Definition: Left Multiplication Maps. Let $* : G \times X \to X$ be an action of the
group $G$ on the set $X$. For each $g \in G$, left multiplication by $g$ (relative to this action) is
the function $L_g : X \to X$ defined by

$$L_g(x) = g * x \qquad (x \in X).$$

Note that $L_g$ does take values in $X$, by the closure axiom for group actions.


9.93. Theorem: Properties of Left Multiplication Maps. Let $(X,*)$ be a $G$-set. (a)
$L_e = \mathrm{id}_X$. (b) For all $g,h \in G$, $L_{gh} = L_g \circ L_h$. (c) For all $g \in G$, $L_g \in \mathrm{Sym}(X)$, and
$L_g^{-1} = L_{g^{-1}}$.


Proof. All functions appearing here have domain $X$ and codomain $X$. So it suffices to check
that the relevant functions take the same value at each $x \in X$. For (a), $L_e(x) = e * x = x =
\mathrm{id}_X(x)$ by the identity axiom for group actions. For (b), $L_{gh}(x) = (gh) * x = g * (h * x) =
L_g(L_h(x)) = (L_g \circ L_h)(x)$ by the associativity axiom for group actions. Finally, using (a) and
(b) with $h = g^{-1}$ shows that $\mathrm{id}_X = L_g \circ L_{g^{-1}}$. Similarly, $\mathrm{id}_X = L_{g^{-1}} \circ L_g$. This means that
$L_{g^{-1}}$ is the two-sided inverse of $L_g$; in particular, both of these maps must be bijections. $\square$


Using the theorem, we can pass from a group action $*$ to a permutation representation
$\phi$ as follows. Define $\phi : G \to \mathrm{Sym}(X)$ by setting $\phi(g) = L_g \in \mathrm{Sym}(X)$ for all $g \in G$. By
part (b) of the theorem,

$$\phi(gh) = L_{gh} = L_g \circ L_h = \phi(g) \circ \phi(h) \qquad (g,h \in G),$$

and so $\phi$ is a group homomorphism.


9.94. Example: Cayley’s Theorem. We have seen that any group $G$ acts on the set
$X = G$ by left multiplication. The preceding construction produces a group homomorphism
$\phi : G \to \mathrm{Sym}(G)$, such that $\phi(g) = L_g = (x \mapsto gx : x \in G)$. We claim that $\phi$ is injective in
this situation. For, suppose $g,h \in G$ and $L_g = L_h$. Applying these two functions to $e$ (the
identity of $G$) gives $g = ge = L_g(e) = L_h(e) = he = h$. It follows that $G$ is isomorphic (via
$\phi$) to the image of $\phi$, which is a subgroup of the symmetric group $\mathrm{Sym}(G)$. We have just
proved Cayley’s Theorem, which says that any group is isomorphic to a subgroup of some
symmetric group. If $G$ has $n$ elements, one can check that $\mathrm{Sym}(G)$ is isomorphic to $S_n$. So
every $n$-element group is isomorphic to a subgroup of the specific symmetric group $S_n$.
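
Cayley’s construction can be carried out explicitly for a small group. The sketch below uses the group of units modulo 8 under multiplication, $\{1,3,5,7\}$, purely as an assumed example; each left multiplication map $L_g$ is recorded as a permutation of the positions $0,\ldots,3$ of the element list. The names `elements`, `left_mult_perm`, and `phi` are illustrative.

```python
elements = [1, 3, 5, 7]                      # units modulo 8 (example group)
index = {x: k for k, x in enumerate(elements)}

def mul(a, b):
    return (a * b) % 8

def compose(p, q):
    # (p o q)(k) = p(q(k)) for permutations of positions 0..n-1
    return tuple(p[q[k]] for k in range(len(q)))

def left_mult_perm(g):
    # L_g as a permutation of positions: position of x -> position of g*x
    return tuple(index[mul(g, x)] for x in elements)

phi = {g: left_mult_perm(g) for g in elements}

each_is_bijection = all(sorted(p) == list(range(len(elements))) for p in phi.values())
is_injective = len(set(phi.values())) == len(elements)
is_homomorphism = all(
    phi[mul(g, h)] == compose(phi[g], phi[h]) for g in elements for h in elements
)
print(each_is_bijection, is_injective, is_homomorphism)
```

The image of `phi` is then a four-element subgroup of $S_4$ isomorphic to the original group.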


9.95. Example. Recall that, for any set $X$, $\mathrm{Sym}(X)$ acts on $X$ via $f * x = f(x)$ for
$f \in \mathrm{Sym}(X)$ and $x \in X$. What is the associated permutation representation $\phi : \mathrm{Sym}(X) \to
\mathrm{Sym}(X)$? First note that for $f \in \mathrm{Sym}(X)$, left multiplication by $f$ is the map $L_f : X \to X$
such that $L_f(x) = f * x = f(x)$. In other words, $L_f = f$, so that $\phi(f) = L_f = f$. This
means that $\phi$ is the identity homomorphism. More generally, whenever a subgroup $H$ of
$\mathrm{Sym}(X)$ acts on $X$ in the canonical way, the corresponding permutation representation is
the inclusion map of $H$ into $\mathrm{Sym}(X)$.


So far, we have seen that every group action of $G$ on $X$ has an associated permutation
representation. We can reverse this process by starting with an arbitrary permutation
representation $\phi : G \to \mathrm{Sym}(X)$ and building a group action, as follows. Given $\phi$, define
$* : G \times X \to X$ by setting $g * x = \phi(g)(x)$ for all $g \in G$ and $x \in X$. Note that $\phi(g)$ is
a function with domain $X$ and codomain $X$, so the expression $\phi(g)(x)$ denotes a well-defined element of $X$.
In particular, $*$ satisfies the closure axiom in 9.80. Since group homomorphisms preserve
identities, $\phi(e) = \mathrm{id}_X$, and so $e * x = \phi(e)(x) = \mathrm{id}_X(x) = x$ for all $x \in X$. So the identity
axiom holds. Finally, using the fact that $\phi$ is a group homomorphism, we calculate

$$(gh) * x = \phi(gh)(x) = (\phi(g) \circ \phi(h))(x) = \phi(g)(\phi(h)(x)) = g * (h * x).$$

So the associativity axiom holds, completing the proof that $*$ is a group action.

The following theorem is the formal enunciation of our earlier claim that group actions
and permutation representations are “essentially the same concept.”


9.96. Theorem: Equivalence of Group Actions and Permutation Representations.
Fix a group $G$ and a set $X$. Let $A$ be the set of all group actions of $G$ on $X$, and
let $P$ be the set of all permutation representations of $G$ on $X$. There are mutually inverse
bijections $F : A \to P$ and $H : P \to A$, given by

$$F(*) = \phi : G \to \mathrm{Sym}(X) \quad \text{where } \phi(g) = L_g = (x \mapsto g * x : x \in X);$$
$$H(\phi) = * : G \times X \to X \quad \text{where } g * x = \phi(g)(x) \qquad (g \in G,\ x \in X).$$


Proof. The discussion preceding the theorem has shown that $F$ does map the set $A$ into
the stated codomain $P$, and that $H$ does map the set $P$ into the stated codomain $A$. We
need only verify that $F \circ H = \mathrm{id}_P$ and $H \circ F = \mathrm{id}_A$.

To show $F \circ H = \mathrm{id}_P$, fix $\phi \in P$, and write $* = H(\phi)$ and $\psi = F(*)$. We must confirm
that $\psi = \phi : G \to \mathrm{Sym}(X)$. To do this, fix $g \in G$, and ask whether the two functions
$\psi(g), \phi(g) : X \to X$ are equal. For each $x \in X$,

$$\psi(g)(x) = L_g(x) = g * x = \phi(g)(x).$$

So $\psi(g) = \phi(g)$ for all $g$, hence $\psi = \phi$ as desired.

To show $H \circ F = \mathrm{id}_A$, fix $* \in A$, and write $\phi = F(*)$, $\bullet = H(\phi)$. We must confirm that
$\bullet = *$. For this, fix $g \in G$ and $x \in X$. Now compute

$$g \bullet x = \phi(g)(x) = L_g(x) = g * x. \qquad \square$$


9.97. Example. We can use permutation representations to generate new constructions of
group actions. For instance, suppose $(X,*)$ is a $G$-set with associated permutation
representation $\phi : G \to \mathrm{Sym}(X)$. Now suppose we are given a group homomorphism $u : K \to G$.
Composing with $\phi$ gives a homomorphism $\phi \circ u : K \to \mathrm{Sym}(X)$. This is a permutation
representation of $K$ on $X$, which means that $X$ can be made into a $K$-set in a canonical
way. Specifically, by applying the map $H$ from the theorem, we see that the $K$-action on $X$
is given by

$$k \bullet x = u(k) * x \qquad (k \in K,\ x \in X).$$




9.13 Stable Subsets and Orbits


One way to gain information about a group is to study its subgroups. The analogous concept 
for G-sets appears in the next definition. 


9.98. Definition: G-Stable Subsets. Let $(X,*)$ be a $G$-set. A subset $Y$ of $X$ is called a
$G$-stable subset iff $g * y \in Y$ for all $g \in G$ and all $y \in Y$.


When Y is a G-stable subset, the restriction of * to G x Y maps into the codomain Y, 
by definition. Since the identity axiom and associativity axiom still hold for the restricted 
action, we see that Y is a G-set. 

Recall that every element of a group generates a cyclic subgroup. Similarly, we can pass 
from an element of a G-set to a G-stable subset as follows. 


9.99. Definition: Orbits. Suppose $(X,*)$ is a $G$-set, and $x \in X$. The $G$-orbit of $x$ is the
set

$$Gx = G * x = \{g * x : g \in G\} \subseteq X.$$

Every orbit is a $G$-stable subset: for, given $h \in G$ and $g * x \in Gx$, the associativity axiom
gives $h * (g * x) = (hg) * x \in Gx$. Furthermore, by the identity axiom, $x = e * x \in Gx$ for
each $x \in X$.


9.100. Example. Let $S_5$ act on the set $X = \{1,2,3,4,5\}$ via $f * x = f(x)$ for $f \in S_5$ and
$x \in X$. For each $i \in X$, the orbit $S_5 * i = \{f(i) : f \in S_5\}$ is all of $X$. The reason is that for
any given $j$ in $X$, we can find an $f \in S_5$ such that $f(i) = j$; for instance, take $f = (i,j)$
when $j \neq i$, and $f = \mathrm{id}$ when $j = i$.
On the other hand, consider the subgroup $H = \langle (1,3)(2,4,5) \rangle$ of $S_5$. If we let $H$ act on $X$
via $f * x = f(x)$ for $f \in H$ and $x \in X$, we get different orbits. One may check directly that

$$H * 1 = H * 3 = \{1,3\}, \qquad H * 2 = H * 4 = H * 5 = \{2,4,5\}.$$

Note that the $H$-orbits are precisely the connected components of the digraph representing
the generator $(1,3)(2,4,5)$ of $H$. One can verify that this holds in general whenever a cyclic
subgroup of $S_n$ acts on $\{1,2,\ldots,n\}$.
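
Orbits of a finite permutation group are straightforward to compute once the group elements are listed. In the sketch below each group element is a dictionary on $X = \{1,\ldots,5\}$; the cyclic group $H$ is generated from its generator by taking powers, and the alternating group $A_5$ (see 9.78) is included as a further check, with evenness decided by counting inversions. All function names are illustrative.

```python
from itertools import permutations

X = [1, 2, 3, 4, 5]

def orbits(group, X):
    # Each group element g is a dict X -> X acting by g * x = g[x].
    return {frozenset(g[x] for g in group) for x in X}

def cyclic_group(gen):
    # All powers of the permutation gen, starting from the identity.
    identity = {x: x for x in gen}
    group, current = [identity], dict(gen)
    while current != identity:
        group.append(current)
        current = {x: gen[current[x]] for x in gen}
    return group

def is_even(p):
    # p lists the images of 1..n; even iff the number of inversions is even.
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

S5 = [dict(zip(X, p)) for p in permutations(X)]
A5 = [dict(zip(X, p)) for p in permutations(X) if is_even(p)]
H = cyclic_group({1: 3, 3: 1, 2: 4, 4: 5, 5: 2})   # generated by (1,3)(2,4,5)

print(orbits(S5, X))
print(orbits(A5, X))
print(orbits(H, X))
```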

Now consider the action of $A_5$ on $X$. As in the case of $S_5$, we have $A_5 * i = X$ for all
$i \in X$, but for a different reason. Given $j \in X$, we must now find an even permutation
sending $i$ to $j$. We can use the identity permutation if $i = j$. Otherwise, choose two distinct
elements $k,l$ that are different from $i$ and $j$, and use the permutation $(i,j)(k,l)$.


9.101. Example. Let $S_4$ act on the set $X$ of all 4-tuples of integers by permuting positions:

$$f * (x_1,x_2,x_3,x_4) = (x_{f^{-1}(1)}, x_{f^{-1}(2)}, x_{f^{-1}(3)}, x_{f^{-1}(4)}) \qquad (f \in S_4,\ x_i \in \mathbb{Z}).$$

The $S_4$-orbit of a sequence $x = (x_1,x_2,x_3,x_4)$ consists of all possible sequences obtainable
from $x$ by permuting the entries. For example,

$$S_4 * (5,1,5,1) = \{(1,1,5,5), (1,5,1,5), (1,5,5,1), (5,1,1,5), (5,1,5,1), (5,5,1,1)\}.$$

As another example, $S_4 * (3,3,3,3) = \{(3,3,3,3)\}$ and $S_4 * (1,3,5,7)$ is the set of all
24 permutations of this list. Now consider the cyclic subgroup $H = \langle (1,2,3,4) \rangle$ of $S_4$.
Restricting the action turns $X$ into an $H$-set. When computing orbits relative to the $H$-action,
we are only allowed to cyclically shift the elements in each 4-tuple. So, for instance,

$$H * (5,1,5,1) = \{(5,1,5,1), (1,5,1,5)\};$$
$$H * (1,3,5,7) = \{(1,3,5,7), (3,5,7,1), (5,7,1,3), (7,1,3,5)\}.$$

As before, the orbit of a given $x \in X$ depends heavily on which group is acting on $X$.
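
The orbits in this example can be generated in a few lines. The sketch relies on two observations made above: acting by all of $S_4$ produces every rearrangement of the entries, and acting by the cyclic subgroup generated by $(1,2,3,4)$ produces only cyclic shifts. The function names are illustrative.

```python
from itertools import permutations

def orbit_under_S4(x):
    # Acting by all of S_4 rearranges the entries of x in every possible way;
    # the set discards repeated sequences.
    return set(permutations(x))

def orbit_under_cyclic_shifts(x):
    # The k-th power of the 4-cycle moves the entry in position i to position
    # i + k (cyclically), so the orbit consists of the cyclic shifts of x.
    n = len(x)
    return {tuple(x[(i - k) % n] for i in range(n)) for k in range(n)}

print(sorted(orbit_under_S4((5, 1, 5, 1))))
print(sorted(orbit_under_cyclic_shifts((5, 1, 5, 1))))
print(sorted(orbit_under_cyclic_shifts((1, 3, 5, 7))))
```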


9.102. Example. Let a group $G$ act on itself by left multiplication: $g * x = gx$ for $g,x \in G$.
For every $x \in G$, the orbit $Gx$ is all of $G$. For, given any $y \in G$, we have $(yx^{-1}) * x = y$.
In the next section, we will study what happens when a subgroup $H$ acts on $G$ by left (or
right) multiplication.


9.103. Example: Conjugacy Classes. Let $G$ be a group. We have seen that $G$ acts on
itself by conjugation: $g * x = gxg^{-1}$ for $g,x \in G$. The orbit of $x \in G$ under this action is
the set

$$G * x = \{gxg^{-1} : g \in G\}.$$

This set is called the conjugacy class of $x$ in $G$. For example, when $G = S_3$, the conjugacy
classes are

$$G * \mathrm{id} = \{\mathrm{id}\};$$
$$G * (1,2) = G * (1,3) = G * (2,3) = \{(1,2), (1,3), (2,3)\};$$
$$G * (1,2,3) = G * (1,3,2) = \{(1,2,3), (1,3,2)\}.$$

One can confirm this with the aid of the identities

$$f \circ (i,j) \circ f^{-1} = (f(i),f(j)), \qquad f \circ (i,j,k) \circ f^{-1} = (f(i),f(j),f(k)) \qquad (f \in S_3).$$

(The generalization of this example to any $S_n$ is discussed in §9.16.) We observe in passing
that $G * x = \{x\}$ iff $gxg^{-1} = x$ for all $g \in G$ iff $gx = xg$ for all $g \in G$ iff $x$ commutes with
every element of $G$. In particular, for $G$ commutative, every conjugacy class consists of a
single element.
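
The conjugacy classes of a small symmetric group can be computed directly from the definition of the conjugation action. In this sketch, permutations are 0-based tuples of images and `conjugacy_classes` is an illustrative helper; the class sizes for $S_3$ can be compared with the list above, and $S_4$ is included as an extra data point not taken from the text.

```python
from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(G):
    # The orbit of x under conjugation is {g x g^{-1} : g in G}.
    return {
        frozenset(compose(compose(g, x), inverse(g)) for g in G)
        for x in G
    }

S3 = list(permutations(range(3)))
S4 = list(permutations(range(4)))

sizes_S3 = sorted(len(c) for c in conjugacy_classes(S3))
sizes_S4 = sorted(len(c) for c in conjugacy_classes(S4))
print(sizes_S3, sizes_S4)
```

In both cases the class sizes add up to the order of the group, since the classes are the orbits of an action.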


9.104. Example. Let $B = (X,E)$ be the graph representing a $5 \times 5$ chessboard shown in
Figure 9.3. Let the graph automorphism group $G = \mathrm{Aut}(B)$ act on $X$ via $f * x = f(x)$ for
$f \in G$ and $x \in X$. We explicitly determined the elements of $G$ in 9.67. We can use this
calculation to find all the distinct $G$-orbits. They are:

$$Ga = \{a,e,z,v\} = Ge = Gz = Gv;$$
$$Gb = \{b,d,j,u,y,w,q,f\} = Gd = Gj = \cdots;$$
$$Gc = \{c,p,x,k\};$$
$$Gg = \{g,i,t,r\};$$
$$Gh = \{h,n,s,l\};$$
$$Gm = \{m\}.$$

The reader may have noticed in these examples that distinct orbits of the $G$-action on
$X$ are always pairwise disjoint. We now prove that this always happens.


9.105. Theorem: Orbit Decomposition of a G-set. Let $(X,*)$ be a $G$-set. Every
element $x \in X$ belongs to exactly one $G$-orbit, namely $Gx$. In other words, the distinct
$G$-orbits of the action $*$ form a set partition of $X$.


Proof. Define a relation on $X$ by setting, for $x,y \in X$, $x \sim y$ iff $y = g * x$ for some
$g \in G$. This relation is reflexive on $X$: given $x \in X$, we have $x = e * x$, so $x \sim x$. This
relation is symmetric: given $x,y \in X$ with $x \sim y$, we know $y = g * x$ for some $g \in G$.
A routine calculation shows that $x = g^{-1} * y$, so $y \sim x$. This relation is transitive: given
$x,y,z \in X$ with $x \sim y$ and $y \sim z$, we know $y = g * x$ and $z = h * y$ for some $g,h \in G$. So
$z = h * (g * x) = (hg) * x$, and $x \sim z$. Thus we have an equivalence relation on $X$. Recall
from the proof of 2.55 that the equivalence classes of any equivalence relation on $X$ form a
set partition of $X$. In this situation, the equivalence classes are precisely the $G$-orbits, since
the equivalence class of $x$ is

$$\{y \in X : x \sim y\} = \{y \in X : y = g * x \text{ for some } g \in G\} = Gx. \qquad \square$$


9.106. Corollary. Every group $G$ is the disjoint union of its conjugacy classes.


Everything we have said can be adapted to give results on right actions. In particular, if
$G$ acts on $X$ on the right, then $X$ is partitioned into a disjoint union of the right $G$-orbits

$$xG = \{x * g : g \in G\} \qquad (x \in X).$$




9.14 Cosets 


The idea of a coset plays a central role in group theory. Cosets arise as the orbits of a certain 
group action. 


9.107. Definition: Right Cosets. Let $G$ be a group, and let $H$ be any subgroup of $G$.
Let $H$ act on $G$ by left multiplication: $h * x = hx$ for $h \in H$ and $x \in G$. The orbit of $x$
under this action, namely

$$Hx = \{hx : h \in H\},$$

is called the right coset of $x$ relative to $H$.


By the general theory of group actions, we know that G is the disjoint union of its right 
cosets. 


9.108. Example. Let $G = S_3$ and $H = \{\mathrm{id}, (1,2)\}$. The right cosets of $H$ in $G$ are

$$H\,\mathrm{id} = H(1,2) = \{\mathrm{id}, (1,2)\} = H;$$
$$H(1,3) = H(1,3,2) = \{(1,3), (1,3,2)\};$$
$$H(2,3) = H(1,2,3) = \{(2,3), (1,2,3)\}.$$

For the subgroup $K = \{\mathrm{id}, (1,2,3), (1,3,2)\}$, the right cosets are
$K\,\mathrm{id} = K$ and $K(1,2) = \{(1,2), (2,3), (1,3)\}$.

Note that the subgroup itself is always a right coset, but the other right cosets are not
subgroups (they do not contain the identity of $G$).

By letting $H$ act on the right, we obtain the notion of a left coset, which will be used
frequently in the sequel.


9.109. Definition: Left Cosets. Let $G$ be a group, and let $H$ be any subgroup of $G$. Let
$H$ act on $G$ by right multiplication: $x * h = xh$ for $h \in H$ and $x \in G$. The orbit of $x$ under
this action, namely

$$xH = \{xh : h \in H\},$$

is called the left coset of $x$ relative to $H$.

By 9.105, $G$ is the disjoint union of its left cosets.


9.110. Example. Let $G = S_3$ and $H = \{\mathrm{id}, (1,2)\}$ as above. The left cosets of $H$ in $G$ are

$$\mathrm{id}\,H = (1,2)H = \{\mathrm{id}, (1,2)\} = H;$$
$$(1,3)H = (1,2,3)H = \{(1,3), (1,2,3)\};$$
$$(2,3)H = (1,3,2)H = \{(2,3), (1,3,2)\}.$$

Observe that $xH \neq Hx$ except when $x \in H$. This shows that left cosets and right cosets
do not coincide in general. On the other hand, for the subgroup $K = \{\mathrm{id}, (1,2,3), (1,3,2)\}$,
the left cosets are $K$ and $(1,2)K = \{(1,2), (1,3), (2,3)\}$. One checks that $xK = Kx$ for all
$x \in S_3$, so that left cosets and right cosets do coincide for some subgroups.
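
The coset computations for $S_3$ can be reproduced mechanically. In the sketch below a permutation is stored as the tuple of images of $1,2,3$, so that $(1,2)$ becomes `(2, 1, 3)` and $(1,2,3)$ becomes `(2, 3, 1)`; this encoding and the helper names are illustrative choices.

```python
from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p(q(i)); p and q list the images of 1, 2, ..., n
    return tuple(p[q[i] - 1] for i in range(len(q)))

G = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 1, 3)]               # {id, (1,2)}
K = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]    # {id, (1,2,3), (1,3,2)}

def left_cosets(G, H):
    # xH = {xh : h in H}
    return {frozenset(compose(x, h) for h in H) for x in G}

def right_cosets(G, H):
    # Hx = {hx : h in H}
    return {frozenset(compose(h, x) for h in H) for x in G}

print(len(left_cosets(G, H)), len(right_cosets(G, H)))
print(left_cosets(G, H) == right_cosets(G, H))   # do left and right cosets of H agree?
print(left_cosets(G, K) == right_cosets(G, K))   # do left and right cosets of K agree?
```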


Although $x = y$ certainly implies $xH = yH$, one must remember that the converse is
almost always false. The next result gives criteria for deciding when two cosets $xH$ and $yH$
are equal; it is used constantly in arguments involving cosets.


9.111. Coset Equality Theorem. Let $H$ be a subgroup of $G$. For all $x,y \in G$, the
following conditions are logically equivalent:


(a) $xH = yH$.                                  (a’) $yH = xH$.

(b) $x \in yH$.                                 (b’) $y \in xH$.

(c) There exists $h \in H$ with $x = yh$.       (c’) There exists $h' \in H$ with $y = xh'$.

(d) $y^{-1}x \in H$.                            (d’) $x^{-1}y \in H$.


Proof. We first prove (a)$\Rightarrow$(b)$\Rightarrow$(c)$\Rightarrow$(d)$\Rightarrow$(a). If $xH = yH$, then $x = xe \in xH = yH$, so
$x \in yH$. If $x \in yH$, then $x = yh$ for some $h \in H$ by definition of $yH$. If $x = yh$ for some
$h \in H$, then multiplying by $y^{-1}$ on the left gives $y^{-1}x \in H$. Finally, assume that $y^{-1}x \in H$.
Then $y(y^{-1}x) = x$ lies in the orbit $yH$. We also have $x = xe \in xH$. As orbits are either
disjoint or equal, we must have $xH = yH$.

Interchanging $x$ and $y$ in the last paragraph proves the equivalence of (a’), (b’), (c’),
and (d’). Since (a) and (a’) are visibly equivalent, the proof is complete. $\square$


9.112. Remark. The equivalence of (a) and (d) in the last theorem is used quite frequently.
Note too that the subgroup $H$ is a coset (namely $eH$), and $xH = H$ iff $xH = eH$ iff
$e^{-1}x \in H$ iff $x \in H$. Finally, one can prove an analogous theorem for right cosets. The key
difference is that $Hx = Hy$ iff $xy^{-1} \in H$ iff $yx^{-1} \in H$ (so that inverses occur on the right
for right cosets).


We can use cosets to construct more examples of G-sets. 


9.113. The G-set G/H. Let $G$ be a group, and let $H$ be any subgroup of $G$. Let $G/H$ be
the set of all distinct left cosets of $H$ in $G$. Every element of $G/H$ is a subset of $G$ of the
form $xH = \{xh : h \in H\}$ for some $x \in G$ (which is usually not unique). So, $G/H$ is a subset
of the power set $P(G)$. Let the group $G$ act on the set $X = G/H$ by left multiplication:

$$g * S = \{gs : s \in S\} \qquad (g \in G,\ S \in X).$$

Note that this action is the restriction of the action from 9.85 to $G \times X$. To see that the
action makes sense, we must check that $X$ is a $G$-stable subset of $P(G)$. Let $xH$ be an
element of $X$ and let $g \in G$; then

$$g * (xH) = \{g(xh) : h \in H\} = \{(gx)h : h \in H\} = (gx)H \in X.$$

Let $[G : H] = |G/H|$ (which may be infinite); this cardinal number is called the index of $H$
in $G$. Lagrange’s Theorem (below) will show that $|G/H| = |G|/|H|$ when $G$ is finite.


9.114. Remark. Using the coset equality theorems, one can show that $xH \mapsto Hx^{-1}$ gives
a well-defined bijection between $G/H$ and the set of right cosets of $H$ in $G$. So, we would
obtain the same number $[G : H]$ if we had used right cosets in the definition of $G/H$. It is
more convenient to use left cosets here, so that $G$ can act on $G/H$ on the left.


9.115. Example. If $G = S_3$ and $H = \{\mathrm{id}, (1,2)\}$, then

$$G/H = \{\{\mathrm{id}, (1,2)\}, \{(1,3), (1,2,3)\}, \{(2,3), (1,3,2)\}\} = \{\mathrm{id}\,H, (1,3)H, (2,3)H\}.$$

We have $[G : H] = |G/H| = 3$. Note that $|G|/|H| = 6/2 = 3 = |G/H|$. This is a special
case of Lagrange’s Theorem, proved below.


To prepare for Lagrange’s Theorem, we first show that every left coset of $H$ in $G$ has
the same cardinality as $H$.


9.116. Coset Size Theorem. Let $H$ be a subgroup of $G$. For all $x \in G$, $|xH| = |H|$.


Proof. We have seen that the left multiplication $L_x : G \to G$, given by $g \mapsto xg$ for $g \in G$, is
a bijection (with inverse $L_{x^{-1}}$). Restricting the domain of $L_x$ to $H$ gives an injective map
$L_x' : H \to G$. The image of this map is $\{xh : h \in H\} = xH$. So, restricting the codomain
gives a bijection from $H$ to $xH$. Thus, the sets $H$ and $xH$ have the same cardinality. □


9.117. Lagrange’s Theorem. Let $H$ be any subgroup of a finite group $G$. Then

$$[G : H] \cdot |H| = |G|.$$

So $|H|$ and $[G : H]$ are divisors of $|G|$, and $|G/H| = [G : H] = |G|/|H|$.


Proof. We know that $G$ is the disjoint union of its distinct left cosets: $G = \bigcup_{S \in G/H} S$. By
the previous theorem, $|S| = |H|$ for every $S \in G/H$. So, by the sum rule,

$$|G| = \sum_{S \in G/H} |S| = \sum_{S \in G/H} |H| = |G/H| \cdot |H| = [G : H] \cdot |H|. \qquad \Box$$


9.118. Remark. The equality of cardinal numbers $|H| \cdot [G : H] = |G|$ holds even when $G$
is infinite, with the same proof.


9.119. Theorem: Order of Group Elements. If $G$ is a finite group of size $n$ and $x \in G$,
then the order of $x$ is a divisor of $n$, and $x^n = e_G$.


Proof. Consider the subgroup $H = \langle x \rangle$ generated by $x$. The order $d$ of $x$ is $|H|$, which divides
$|G| = n$ by Lagrange’s theorem. Writing $n = cd$, we see that $x^n = (x^d)^c = e^c = e$. □


The next result gives an interpretation for cosets $xK$ in the case where $K$ is the kernel
of a group homomorphism.


9.120. Theorem: Cosets of the Kernel of a Homomorphism. Let $f : G \to L$ be a
group homomorphism with kernel $K$. For every $x \in G$,

$$xK = \{y \in G : f(y) = f(x)\} = Kx.$$

If $G$ is finite and $I$ is the image of $f$, it follows that $|G| = |K| \cdot |I|$.


Proof. Fix $x \in G$, and set $S = \{y \in G : f(y) = f(x)\}$. We will prove that $xK = S$. First
suppose $y \in xK$, so $y = xk$ for some $k \in K$. Applying $f$, we find that $f(y) = f(xk) =
f(x)f(k) = f(x)e_L = f(x)$, so $y \in S$. Next suppose $y \in S$, so $f(y) = f(x)$. Note that
$f(x^{-1}y) = f(x)^{-1}f(y) = e_L$, so $x^{-1}y \in \ker(f) = K$. So $y = x(x^{-1}y) \in xK$. The proof that
$S = Kx$ is analogous. To obtain the formula for $|G|$, note that $G$ is the disjoint union

$$G = \bigcup_{z \in I} \{y \in G : f(y) = z\}.$$

Every $z \in I$ has the form $z = f(x)$ for some $x \in G$. So, by what we have just proved, each
set appearing in the union is a coset of $K$, which has the same cardinality as $K$. So the sum
rule gives $|G| = \sum_{z \in I} |K| = |K| \cdot |I|$. □


9.121. Corollary: Size of $A_n$. For $n > 1$, $|A_n| = n!/2$.


Proof. We know that $\mathrm{sgn} : S_n \to \{1, -1\}$ is a surjective group homomorphism with kernel
$A_n$. So $n! = |S_n| = |A_n| \cdot 2$. □


9.15 The Size of an Orbit 


In 9.105, we saw that every $G$-set $X$ breaks up into a disjoint union of orbits. This result
suggests two combinatorial questions. First, given $x \in X$, what is the size of the orbit $Gx$?
Second, how many orbits are there? We answer the first question here; the second question
will be solved in §9.18.

The key to computing the orbit size $|Gx|$ is to relate the $G$-set $Gx$ to one of the special
$G$-sets $G/H$ defined in 9.113. For this purpose, we need to associate a subgroup $H$ of $G$ to
the given orbit $Gx$.


9.122. Definition: Stabilizers. Let $(X, *)$ be a $G$-set. For each $x \in X$, the stabilizer of
$x$ in $G$ is

$$\mathrm{Stab}(x) = \{g \in G : g * x = x\}.$$

Sometimes the notation $G_x$ is used to denote $\mathrm{Stab}(x)$.


The following calculations show that $\mathrm{Stab}(x)$ is a subgroup of $G$ for each $x \in X$: $e * x = x$;
$g * x = x$ implies $x = g^{-1} * x$ for $g \in G$; $g * x = x = h * x$ implies $(gh) * x = g * (h * x) = x$
for $g, h \in G$.


9.123. Example. Let $S_n$ act on $X = \{1, 2, \ldots, n\}$ via $f * x = f(x)$ for $f \in S_n$ and $x \in X$.
The stabilizer of a point $i \in X$ consists of all permutations of $X$ for which $i$ is a fixed point.
In particular, $\mathrm{Stab}(n)$ consists of all bijections $f : X \to X$ with $f(n) = n$. Restricting the
domain to $\{1, 2, \ldots, n-1\}$ defines a group isomorphism between $\mathrm{Stab}(n)$ and $S_{n-1}$.


9.124. Example. Let a group $G$ act on itself by left multiplication. Right cancellation of $x$
shows that $gx = x$ iff $g = e$. Therefore, $\mathrm{Stab}(x) = \{e\}$ for all $x \in G$. At the other extreme,
we can let $G$ act on any set $X$ by declaring $g * x = x$ for all $g \in G$ and all $x \in X$. Relative
to this action, $\mathrm{Stab}(x) = G$ for all $x \in X$.


9.125. Example: Centralizers. Let $G$ act on itself by conjugation: $g * x = gxg^{-1}$ for all
$g, x \in G$. For a given $x \in G$, $g \in \mathrm{Stab}(x)$ iff $gxg^{-1} = x$ iff $gx = xg$ iff $g$ commutes with $x$.
This stabilizer subgroup is often denoted $C_G(x)$ and called the centralizer of $x$ in $G$. The
intersection $\bigcap_{x \in G} C_G(x)$ consists of all $g \in G$ that commute with every $x \in G$. This is a
subgroup called the center of $G$ and denoted $Z(G)$.


9.126. Example: Normalizers. Let $G$ be a group, and let $X$ be the set of all subgroups
of $G$. $G$ acts on $X$ by conjugation: $g * H = gHg^{-1} = \{ghg^{-1} : h \in H\}$. (Note that $g * H$ is
a subgroup, since it is the image of a subgroup under the inner automorphism “conjugation
by $g$”; cf. 9.74.) For this action, $g \in \mathrm{Stab}(H)$ iff $gHg^{-1} = H$. This stabilizer subgroup is
denoted $N_G(H)$ and called the normalizer of $H$ in $G$. One may check that $N_G(H)$ always
contains $H$.


9.127. Example. Let $S_4$ act on 4-tuples of integers by permuting the positions. Then
$\mathrm{Stab}((5,1,5,1)) = \{\mathrm{id}, (1,3), (2,4), (1,3)(2,4)\}$; $\mathrm{Stab}((2,2,2,2)) = S_4$; $\mathrm{Stab}((1,2,3,4)) =
\{\mathrm{id}\}$; and $\mathrm{Stab}((2,5,2,2))$ is a subgroup of $S_4$ isomorphic to $\mathrm{Sym}(\{1,3,4\})$, which is in turn
isomorphic to $S_3$.


The following fundamental theorem calculates the size of an orbit of a group action. 


9.128. Theorem: Size of an Orbit. Let $(X, *)$ be a $G$-set. For each $x \in X$, there is a
bijection $f : G/\mathrm{Stab}(x) \to Gx$ given by $f(g\,\mathrm{Stab}(x)) = g * x$ for all $g \in G$. So, when $G$ is
finite, the size of the orbit of $x$ is the index of the stabilizer of $x$, which is a divisor of $|G|$:

$$|Gx| = [G : \mathrm{Stab}(x)] = |G|/|\mathrm{Stab}(x)|.$$


Proof. Write $H = \mathrm{Stab}(x)$ for convenience. We first check that the function $f : G/H \to Gx$
is well defined. Assume $g, k \in G$ satisfy $gH = kH$; we must check that $g * x = k * x$. Now,
$gH = kH$ means $k^{-1}g \in H = \mathrm{Stab}(x)$, and hence $(k^{-1}g) * x = x$. Acting on both sides
by $k$ and simplifying, we obtain $g * x = k * x$. Second, is $f$ one-to-one? Fix $g, k \in G$ with
$f(gH) = f(kH)$; we must prove $gH = kH$. Now, $f(gH) = f(kH)$ means $g * x = k * x$.
Acting on both sides by $k^{-1}$, we find that $(k^{-1}g) * x = x$, so $k^{-1}g \in H$, so $gH = kH$. Third,
is $f$ surjective? Given $y \in Gx$, the definition of $Gx$ says that $y = g * x$ for some $g \in G$, so
$y = f(gH)$. In summary, $f$ is a well-defined bijection. □
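The orbit size formula can be checked numerically on the 4-tuples of Example 9.127. The sketch below is an illustration only; it assumes that a permutation $g$ moves the entry in position $i$ to position $g(i)$, with positions numbered 0 to 3, and the helper names are hypothetical.

```python
from itertools import permutations

def act(g, w):
    """Permute positions: the entry in position i moves to position g[i]."""
    out = [None] * len(w)
    for i, y in enumerate(w):
        out[g[i]] = y
    return tuple(out)

G = list(permutations(range(4)))   # S_4 acting on positions 0..3

def orbit_and_stabilizer(w):
    """Return the orbit of w (a set of tuples) and its stabilizer (a list of permutations)."""
    orbit = {act(g, w) for g in G}
    stab = [g for g in G if act(g, w) == w]
    return orbit, stab

for w in [(5, 1, 5, 1), (2, 2, 2, 2), (1, 2, 3, 4), (2, 5, 2, 2)]:
    orbit, stab = orbit_and_stabilizer(w)
    print(w, len(orbit), len(stab), len(orbit) * len(stab) == len(G))
```

For each tuple, the product of the orbit size and the stabilizer size equals $|S_4| = 24$, as the theorem predicts.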


9.129. Remark. One can prove a stronger version of the theorem, analogous to the “fundamental
homomorphism theorem for groups,” by introducing the following definition. Given
two $G$-sets $(X, *)$ and $(Y, \bullet)$, a $G$-map is a function $p : X \to Y$ such that $p(g * x) = g \bullet p(x)$
for all $g \in G$ and all $x \in X$. A $G$-isomorphism is a bijective $G$-map. The theorem gives us
a bijection $p$ from the $G$-set $Gx$ to the $G$-set $G/\mathrm{Stab}(x)$ such that $p(g_0 * x) = g_0\,\mathrm{Stab}(x)$.
This bijection is in fact a $G$-isomorphism, because

$$p(g * (g_0 * x)) = p((gg_0) * x) = (gg_0)\,\mathrm{Stab}(x) = g \bullet (g_0\,\mathrm{Stab}(x)) = g \bullet p(g_0 * x).$$

Since every $G$-set is a disjoint union of orbits, this result shows that the special $G$-sets of
the form $G/H$ are the “building blocks” from which all $G$-sets are constructed.


Applying 9.128 to some of the preceding examples gives the following corollary.


9.130. Corollary: Counting Conjugates of Group Elements and Subgroups. The
size of the conjugacy class of $x$ in a finite group $G$ is $[G : \mathrm{Stab}(x)] = [G : C_G(x)] =
|G|/|C_G(x)|$. If $H$ is a subgroup of $G$, the number of distinct conjugates of $H$ (subgroups of
the form $gHg^{-1}$) is $[G : \mathrm{Stab}(H)] = [G : N_G(H)] = |G|/|N_G(H)|$.


9.16 Conjugacy Classes in $S_n$


The conjugacy classes in the symmetric groups $S_n$ can be described explicitly. We shall
prove that the conjugacy class of $f \in S_n$ consists of all $g \in S_n$ with the same cycle type as
$f$ (see 9.20). The proof employs the following computational result.


9.131. Theorem: Conjugation in $S_n$. For $f, g \in S_n$, the permutation $gfg^{-1}$ can be
obtained by applying $g$ to each entry in the disjoint cycle decomposition of $f$. In other
words, if

$$f = (i_1, i_2, i_3, \ldots)(j_1, j_2, \ldots)(k_1, k_2, \ldots) \cdots,$$

then

$$gfg^{-1} = (g(i_1), g(i_2), g(i_3), \ldots)(g(j_1), g(j_2), \ldots)(g(k_1), g(k_2), \ldots) \cdots.$$

In particular, $\mathrm{type}(gfg^{-1}) = \mathrm{type}(f)$.


Proof. First assume $f$ is a $k$-cycle, say $f = (i_1, i_2, \ldots, i_k)$. We prove that the functions
$gfg^{-1}$ and $h = (g(i_1), g(i_2), \ldots, g(i_k))$ are equal by showing that both have the same effect
on every $x \in \{1, 2, \ldots, n\}$. We consider various cases. First, if $x = g(i_s)$ for some $s < k$,
then $gfg^{-1}(x) = gfg^{-1}(g(i_s)) = g(f(i_s)) = g(i_{s+1}) = h(x)$. Second, if $x = g(i_k)$, then
$gfg^{-1}(x) = g(f(i_k)) = g(i_1) = h(x)$. Finally, if $x$ does not equal any $g(i_s)$, then $g^{-1}(x)$
does not equal any $i_s$. So $f$ fixes $g^{-1}(x)$, and $gfg^{-1}(x) = g(g^{-1}(x)) = x = h(x)$.

In the general case, write $f = C_1 \circ C_2 \circ \cdots \circ C_t$, where each $C_i$ is a cycle. Since conjugation
by $g$ is a homomorphism,

$$gfg^{-1} = (gC_1g^{-1}) \circ (gC_2g^{-1}) \circ \cdots \circ (gC_tg^{-1}).$$

By the previous paragraph, we can compute $gC_ig^{-1}$ by applying $g$ to each element of $C_i$.
This completes the proof. □


9.132. Theorem: Conjugacy Classes of $S_n$. The conjugacy class of $f \in S_n$ consists of
all $h \in S_n$ with $\mathrm{type}(h) = \mathrm{type}(f)$. The number of conjugacy classes is $p(n)$, the number of
integer partitions of $n$.


Proof. Fix $f \in S_n$; let $T = \{gfg^{-1} : g \in S_n\}$ be the conjugacy class of $f$, and let $U = \{h \in
S_n : \mathrm{type}(h) = \mathrm{type}(f)\}$. Using 9.131, we see that $T \subseteq U$. For the reverse inclusion, let
$h \in S_n$ have the same cycle type as $f$. We give an algorithm for finding a $g \in S_n$ such that
$h = gfg^{-1}$. Write down any complete cycle decomposition of $f$ (including 1-cycles), writing
longer cycles before shorter cycles. Immediately below this, write down a complete cycle
decomposition of $h$. Now erase all the parentheses and regard the resulting array as the
two-line form of a permutation $g$. Then 9.131 shows that $gfg^{-1} = h$. For example, suppose

$$f = (1,7,3)(2,8,9)(4,5)(6),$$
$$h = (4,9,2)(6,3,5)(1,8)(7).$$

Erasing the parentheses, $g$ sends 1, 7, 3, 2, 8, 9, 4, 5, 6 to 4, 9, 2, 6, 3, 5, 1, 8, 7, respectively;
in cycle notation, $g = (1,4)(2,6,7,9,5,8,3)$.
The $g$ constructed here is not unique; we could obtain different $g$’s satisfying $gfg^{-1} = h$ by
starting with a different complete cycle decomposition for $f$ or $h$.

The last statement of the theorem follows since the possible cycle types of permutations
of $n$ objects are exactly the integer partitions of $n$ (weakly decreasing sequences of positive
integers that sum to $n$). □
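The algorithm in the proof (align two complete cycle decompositions and read off $g$ column by column) can be sketched in Python. This is an illustrative sketch, not part of the text: permutations are stored as dictionaries on $\{1, \ldots, 9\}$, the function names `from_cycles` and `conjugator` are hypothetical, and the two cycle lists are assumed to list cycles of matching lengths in the same order.

```python
def from_cycles(cycles, n):
    """Build a permutation of {1,...,n} (as a dict) from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm

def conjugator(f_cycles, h_cycles):
    """Erase parentheses in both decompositions and read the array as g in two-line form."""
    top = [a for cyc in f_cycles for a in cyc]
    bottom = [b for cyc in h_cycles for b in cyc]
    return dict(zip(top, bottom))

f_cycles = [(1, 7, 3), (2, 8, 9), (4, 5), (6,)]
h_cycles = [(4, 9, 2), (6, 3, 5), (1, 8), (7,)]
f, h = from_cycles(f_cycles, 9), from_cycles(h_cycles, 9)
g = conjugator(f_cycles, h_cycles)
g_inv = {v: k for k, v in g.items()}
conj = {x: g[f[g_inv[x]]] for x in range(1, 10)}   # the permutation g f g^{-1}
print(conj == h)
```

Running the sketch on the $f$ and $h$ of the example confirms that the constructed $g$ satisfies $gfg^{-1} = h$.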
We now apply 9.130 to determine the sizes of the conjugacy classes of $S_n$.


9.133. Definition: $z_\mu$. Let $\mu$ be an integer partition of $n$ consisting of $a_1$ ones, $a_2$ twos,
etc. Define

$$z_\mu = 1^{a_1} 2^{a_2} \cdots n^{a_n}\, a_1!\, a_2! \cdots a_n!.$$

For example, for $\mu = (3,3,2,2,2,2,1,1,1,1,1)$, we have $a_1 = 5$, $a_2 = 4$, $a_3 = 2$, and
$z_\mu = 1^5 2^4 3^2\, 5!\, 4!\, 2! = 829{,}440$.


9.134. Theorem: Size of Conjugacy Classes of $S_n$. For each $\mu \in \mathrm{Par}(n)$, the number
of permutations $f \in S_n$ with $\mathrm{type}(f) = \mu$ is $n!/z_\mu$.


Proof. Fix a particular $f \in S_n$ with $\mathrm{type}(f) = \mu$. By 9.130 and the fact that $|S_n| = n!$,
it is enough to show that $|C_{S_n}(f)| = z_\mu$. The argument is most readily understood by
consideration of a specific example. Let $\mu = (3,3,2,2,2,2,1,1,1,1,1)$ as above, and take

$$f = (1,2,3)(4,5,6)(7,8)(9,10)(11,12)(13,14)(15)(16)(17)(18)(19).$$

A permutation $g \in S_n$ lies in $C_{S_n}(f)$ iff $gfg^{-1} = f$ iff applying $g$ to each symbol in the
cycle decomposition above produces another cycle decomposition of $f$. So we are reduced to
counting the number of ways of writing down a complete cycle decomposition of $f$ such that
longer cycles come before shorter cycles. Note that we have freedom to rearrange the order
of all cycles of a given length, and we also have freedom to cyclically permute the entries
in any given cycle of $f$. For example, we could permute the five 1-cycles of $f$ in any of $5!$
ways; we could replace $(4,5,6)$ by one of the three cyclic shifts $(4,5,6)$ or $(5,6,4)$ or $(6,4,5)$;
and so on. For this particular $f$, the product rule gives $2!\,4!\,5!\,3^2 2^4 1^5 = z_\mu$ different possible
complete cycle decompositions. The argument for the general case is similar: the term $a_i!$
in $z_\mu$ accounts for permuting the $a_i$ cycles of length $i$, while the term $i^{a_i}$ accounts for the
$i$ possible cyclic shifts of each of the $a_i$ cycles of length $i$. Multiplying these contributions
gives $z_\mu$, as desired. □
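As a sanity check on the formula $n!/z_\mu$, the following Python sketch computes $z_\mu$ and compares the predicted class sizes with a brute-force census of $S_5$. The choice $n = 5$ and the helper names `z` and `cycle_type` are illustrative assumptions; permutations are tuples on $0, \ldots, n-1$.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def z(mu):
    """z_mu: product over part sizes i of i^{a_i} * a_i!, where a_i counts parts equal to i."""
    result = 1
    for i, a in Counter(mu).items():
        result *= i ** a * factorial(a)
    return result

def cycle_type(p):
    """Cycle type of a permutation tuple on 0..n-1, as a weakly decreasing tuple."""
    seen, lengths = set(), []
    for start in range(len(p)):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = p[x]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

n = 5
counts = Counter(cycle_type(p) for p in permutations(range(n)))
print(z((3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 1)))
print(all(counts[mu] * z(mu) == factorial(n) for mu in counts), len(counts))
```

The first line printed is the value $829{,}440$ from 9.133; the second confirms that every one of the $p(5) = 7$ cycle types of $S_5$ has exactly $5!/z_\mu$ permutations.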




9.17 Applications of the Orbit Size Formula 


When a finite group $G$ acts on a finite set $X$, 9.128 asserts that the size of the orbit $Gx$ is
$|G|/|\mathrm{Stab}(x)|$, which is a divisor of $|G|$. We now use this fact to establish several famous
theorems from algebra, number theory, and combinatorics.


9.135. Fermat’s Little Theorem. For every integer $a > 0$ and every prime $p$, $a^p \equiv a
\pmod{p}$.


Proof. Let $Y = \{1, 2, \ldots, a\}$, and let $X = Y^p$ be the set of all $p$-tuples $(y_1, \ldots, y_p)$ of
elements of $Y$. By the product rule, $|X| = a^p$. We know that $S_p$ acts on $X$ by permuting
positions (see 9.89). Let $H = \langle (1, 2, \ldots, p) \rangle$, which is a cyclic subgroup of $S_p$ of size $p$.
Restricting the action to $H$, we see that $H$ acts on $X$ by cyclically shifting positions. The
only divisors of the prime $p$ are 1 and $p$, so all orbits of $X$ under the $H$-action have size 1
or $p$. Since $X$ is the disjoint union of the orbits, $|X|$ is congruent modulo $p$ to the number
of orbits of size 1. But one sees immediately that $w = (y_1, \ldots, y_p)$ is in an orbit of size 1 iff
all cyclic shifts of $w$ are equal to $w$ iff $y_1 = \cdots = y_p \in Y$. So there are precisely $a$ orbits of
size 1, as desired. □


9.136. Cauchy’s Theorem. Suppose $G$ is a finite group and $p$ is a prime divisor of $|G|$.
Then there exists an element $x \in G$ of order $p$.


Proof. As in the previous proof, the group $H = \langle (1, 2, \ldots, p) \rangle$ acts on the set $G^p$ by cyclically
permuting positions. Let $X$ consist of all $p$-tuples $(g_1, \ldots, g_p) \in G^p$ such that $g_1 g_2 \cdots g_p = e$.
We can build a typical element of $X$ by choosing $g_1, \ldots, g_{p-1}$ arbitrarily from $G$; then we
are forced to choose $g_p = (g_1 \cdots g_{p-1})^{-1}$ to achieve the condition $g_1 g_2 \cdots g_{p-1} g_p = e$. The
product rule therefore gives $|X| = |G|^{p-1}$, which is a multiple of $p$.

We next claim that $X$ is an $H$-stable subset of $G^p$. This means that for all $i \le p$,
$g_1 g_2 \cdots g_p = e$ implies $g_i g_{i+1} \cdots g_p g_1 \cdots g_{i-1} = e$. To prove this, multiply the equation
$g_1 g_2 \cdots g_p = e$ by $(g_1 g_2 \cdots g_{i-1})^{-1}$ on the left and by $(g_1 g_2 \cdots g_{i-1})$ on the right. We now
know that $X$ is an $H$-set, so it is a union of orbits of size 1 and size $p$. Since $|X|$ is a multiple
of $p$, the number of orbits of size 1 must be a multiple of $p$ as well. Now, $(e, e, \ldots, e)$ is one
orbit of size 1; so there must exist at least $p - 1 > 0$ additional orbits of size 1. By definition
of the $H$-action, such an orbit looks like $(x, x, \ldots, x)$ where $x \ne e$. By definition of $X$, we
must have $x^p = e$. Since $p$ is prime, we have proved the existence of an element $x$ of order
$p$ (in fact, we know there are at least $p - 1$ such elements). □
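The counting in this proof can be observed directly in a small case. The sketch below is an illustration under assumptions not in the text: it takes $G = S_3$ (stored as tuples on 0, 1, 2), enumerates the set $X$ of $p$-tuples with product $e$, and counts the constant tuples $(x, \ldots, x)$, which are the orbits of size 1. The function name `cauchy_counts` is hypothetical.

```python
from itertools import permutations, product

def compose(p, q):
    """Return the permutation p∘q (apply q first)."""
    return tuple(p[q[i]] for i in range(len(q)))

def cauchy_counts(G, p):
    """Return (|X|, number of constant tuples in X) for the set X used in the proof."""
    identity = tuple(range(len(G[0])))
    X = []
    for tup in product(G, repeat=p):
        prod = identity
        for g in tup:
            prod = compose(prod, g)
        if prod == identity:
            X.append(tup)
    constant = [t for t in X if len(set(t)) == 1]   # orbits of size 1
    return len(X), len(constant)

S3 = list(permutations(range(3)))
for p in (2, 3):
    print(p, cauchy_counts(S3, p))
```

For $p = 2$ the script finds $|X| = 6$ with 4 constant tuples, and for $p = 3$ it finds $|X| = 36$ with 3 constant tuples; in both cases the number of constant tuples is a multiple of $p$, so some $x \ne e$ satisfies $x^p = e$.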


9.137. Lucas’ Congruence for Binomial Coefficients. Suppose $p$ is prime and $0 \le
k \le n$ are integers. Let $n$ and $k$ have base-$p$ expansions $n = \sum_{i \ge 0} n_i p^i$, $k = \sum_{i \ge 0} k_i p^i$,
where $0 \le n_i, k_i < p$ (see 5.5). Then

$$\binom{n}{k} \equiv \prod_{i \ge 0} \binom{n_i}{k_i} \pmod{p}, \tag{9.6}$$

where we set $\binom{0}{0} = 1$ and $\binom{a}{b} = 0$ whenever $b > a$.


Proof. Step 1: For all $j \ge 0$, $m \ge 0$, and $p$ prime, we show that

$$\binom{m+p}{j} \equiv \binom{m}{j} + \binom{m}{j-p} \pmod{p}.$$

To prove this identity, let $X = \{1, 2, \ldots, m+p\}$, and let $Y$ be the set of all $j$-element subsets
of $X$. We know that $|Y| = \binom{m+p}{j}$. Consider the subgroup $G = \langle (1, 2, \ldots, p) \rangle$ of $\mathrm{Sym}(X)$,
which is cyclic of size $p$. $G$ acts on $Y$ via $g * S = \{g(s) : s \in S\}$ for $g \in G$ and $S \in Y$.

$Y$ is a disjoint union of orbits under this action. Since every orbit has size 1 or $p$, $|Y|$ is
congruent modulo $p$ to the number $M$ of orbits of size 1. We will show that $M = \binom{m}{j} + \binom{m}{j-p}$.
The orbits of size 1 correspond to the $j$-element subsets $S$ of $X$ such that $g * S = S$ for
all $g \in G$. It is equivalent to require that $f * S = S$ for the generator $f = (1, 2, \ldots, p)$ of
$G$. Suppose $S$ satisfies this condition, and consider two cases. Case 1: $S \cap \{1, 2, \ldots, p\} = \emptyset$.
Since $f(x) = x$ for $x > p$, we have $f * S = S$ for all such subsets $S$. Since $S$ can be an
arbitrary $j$-element subset of the $m$-element set $\{p+1, \ldots, m+p\}$, there are $\binom{m}{j}$ subsets of this form.
Case 2: $S \cap \{1, 2, \ldots, p\} \ne \emptyset$. Say $i \in S$ where $1 \le i \le p$. Applying $f$ repeatedly and noting
that $f * S = S$, we see that $\{1, 2, \ldots, p\} \subseteq S$. The remaining $j - p$ elements of $S$ can be
chosen arbitrarily from the $m$-element set $\{p+1, \ldots, m+p\}$. So there are $\binom{m}{j-p}$ subsets of
this form. Combining the two cases, we see that $M = \binom{m}{j} + \binom{m}{j-p}$.

Step 2: Assume $p$ is prime, $a, c \ge 0$, and $0 \le b, d < p$; we show that $\binom{ap+b}{cp+d} \equiv \binom{a}{c}\binom{b}{d}
\pmod{p}$. This will follow from step 1 and the identity $\binom{a+1}{c} = \binom{a}{c} + \binom{a}{c-1}$ (see 2.25). We
argue by induction on $a$. The base step is $a = 0$. If $a = 0$ and $c > 0$, both sides of the
congruence are zero; if $a = 0 = c$, then both sides of the congruence are $\binom{b}{d}$. Assuming that
the result holds for a given $a$ (and all $b, c, d$), the following computation shows that it holds
for $a + 1$:

$$\binom{(a+1)p+b}{cp+d} = \binom{(ap+b)+p}{cp+d} \equiv \binom{ap+b}{cp+d} + \binom{ap+b}{(c-1)p+d}
\equiv \binom{a}{c}\binom{b}{d} + \binom{a}{c-1}\binom{b}{d} = \left[\binom{a}{c} + \binom{a}{c-1}\right]\binom{b}{d} = \binom{a+1}{c}\binom{b}{d} \pmod{p}.$$

Step 3: We prove Lucas’ congruence (9.6) by induction on $n$. If $k > n$, then $k_i > n_i$ for
some $i$, so that both sides of the congruence are zero. From now on, assume $k \le n$. The
result holds in the base cases $0 \le n < p$, since $n = n_0$, $k = k_0$, and all higher digits of
the base-$p$ expansion are zero. For the induction step, note that $n = ap + n_0$, $k = cp + k_0$,
where $a = \sum_{i \ge 0} n_{i+1} p^i$ and $c = \sum_{i \ge 0} k_{i+1} p^i$ in base $p$. (We obtain $a$ and $c$ from $n$ and $k$,
respectively, by chopping off the final base-$p$ digits $n_0$ and $k_0$.) By step 2 and induction, we have

$$\binom{n}{k} = \binom{ap+n_0}{cp+k_0} \equiv \binom{a}{c}\binom{n_0}{k_0} \equiv \binom{n_0}{k_0} \prod_{i \ge 1} \binom{n_i}{k_i} = \prod_{i \ge 0} \binom{n_i}{k_i} \pmod{p}. \qquad \Box$$

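Lucas’ congruence translates directly into a digit-by-digit procedure. The following Python sketch (the function name `lucas_binomial_mod` is a hypothetical choice) peels off base-$p$ digits of $n$ and $k$ and multiplies the small binomial coefficients, then compares the result with a direct computation for small inputs.

```python
from math import comb

def lucas_binomial_mod(n, k, p):
    """Return C(n, k) mod p for a prime p by multiplying C(n_i, k_i) over base-p digits."""
    result = 1
    while n > 0 or k > 0:
        n, n_i = divmod(n, p)
        k, k_i = divmod(k, p)
        result = result * comb(n_i, k_i) % p   # comb returns 0 when k_i > n_i
    return result

print(all(lucas_binomial_mod(n, k, p) == comb(n, k) % p
          for p in (2, 3, 5, 7) for n in range(60) for k in range(n + 1)))
```

The check compares against `math.comb` only on a small range; it illustrates the congruence rather than proving it.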

9.138. Corollary. Given $a, b, p \in \mathbb{N}^+$ with $p$ prime and $p$ not dividing $b$,
$p$ does not divide $\binom{p^a b}{p^a}$.


Proof. Write $b = \sum_{i \ge 0} b_i p^i$ in base $p$. The base-$p$ expansions of $p^a b$ and $p^a$ are $p^a b =
\cdots b_3 b_2 b_1 b_0 00 \cdots 0$ and $p^a = 1000 \cdots 0$, respectively, where each expansion ends in $a$ zeroes.
Since $b_0 \ne 0$ by hypothesis, Lucas’ congruence gives

$$\binom{p^a b}{p^a} \equiv \binom{b_0}{1} = b_0 \not\equiv 0 \pmod{p}. \qquad \Box$$


This corollary can also be proved directly, by writing out the fraction defining $\binom{p^a b}{p^a}$ and
counting powers of $p$ in numerator and denominator. We leave this as an exercise for the
reader.


9.139. Sylow’s First Theorem. Let $G$ be a finite group of size $p^a b$, where $p$ is prime,
$a > 0$, and $p$ does not divide $b$. There exists a subgroup $H$ of $G$ of size $p^a$.


Proof. Let $X$ be the collection of all subsets of $G$ of size $p^a$. We know from 1.42 that
$|X| = \binom{p^a b}{p^a}$. By 9.138, $p$ does not divide $|X|$. Now, $G$ acts on $X$ by left multiplication:
$g * S = \{gs : s \in S\}$ for $g \in G$ and $S \in X$. (The set $g * S$ still has size $p^a$, since left
multiplication by $g$ is injective.) Not every orbit of $X$ has size divisible by $p$, since $|X|$ itself
is not divisible by $p$. Choose $T \in X$ such that $|GT| \not\equiv 0 \pmod{p}$. Let $H = \mathrm{Stab}(T) = \{g \in
G : g * T = T\}$, which is a subgroup of $G$. The size of the orbit of $T$ is $|G|/|H| = p^a b/|H|$.
This integer is not divisible by $p$, forcing $|H|$ to be a multiple of $p^a$. So $|H| \ge p^a$. To obtain
the reverse inequality, let $t_0$ be any fixed element of $T$. Given any $h \in H$, $h * T = T$ implies
$ht_0 \in T$. So the right coset $Ht_0 = \{ht_0 : h \in H\}$ is contained in $T$. We conclude that
$|H| = |Ht_0| \le |T| = p^a$. Thus $H$ is a subgroup of size $p^a$ (and $T$ is in fact one of the right
cosets of $H$). □




9.18 The Number of Orbits 


The following theorem, which is traditionally known as “Burnside’s Lemma,” allows us to
count the number of orbits in a given $G$-set.


9.140. Orbit-Counting Theorem. Let a finite group $G$ act on a finite set $X$. For each
$g \in G$, let $\mathrm{Fix}(g) = \{x \in X : g * x = x\}$ be the set of “fixed points” of $g$, and let $N$ be the
number of distinct orbits. Then

$$N = \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}(g)|.$$

So the number of orbits is the average number of fixed points of elements of $G$.


Proof. Define $f : X \to \mathbb{R}$ by setting $f(x) = 1/|Gx|$ for each $x \in X$. We will compute
$\sum_{x \in X} f(x)$ in two ways. Let $\{O_1, \ldots, O_N\}$ be the distinct orbits of the $G$-action. On one
hand, grouping summands based on which orbit they are in, we get

$$\sum_{x \in X} f(x) = \sum_{i=1}^{N} \sum_{x \in O_i} f(x) = \sum_{i=1}^{N} \sum_{x \in O_i} \frac{1}{|O_i|} = \sum_{i=1}^{N} 1 = N.$$

On the other hand, 9.128 says that $|Gx| = |G|/|\mathrm{Stab}(x)|$. Therefore

$$\sum_{x \in X} f(x) = \sum_{x \in X} \frac{|\mathrm{Stab}(x)|}{|G|} = \frac{1}{|G|} \sum_{x \in X} \sum_{g \in G} \chi(g * x = x)
= \frac{1}{|G|} \sum_{g \in G} \sum_{x \in X} \chi(g * x = x) = \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}(g)|. \qquad \Box$$


We are finally ready to solve the counting problems involving symmetry that were men- 
tioned in the introduction to this chapter. The strategy is to introduce a set of objects X on 
which a certain group of symmetries acts. Each orbit of the group action consists of a set of 
objects in X that get identified with one another when symmetries are taken into account. 
So the solution to the counting problem is the number of orbits, which may be calculated 
by the formula of the previous theorem. 


9.141. Example: Counting Necklaces. How many ways can we build a five-bead circular
necklace if there are seven available types of gemstones (repeats allowed) and all rotations
of a given necklace are considered equivalent? We can model the set of necklaces (before
accounting for symmetries) by the set of words $X = \{(y_1, y_2, y_3, y_4, y_5) : 1 \le y_i \le 7\}$. Now
let $G = \langle (1,2,3,4,5) \rangle$ act on $X$ by cyclically permuting positions (see 9.89). Every orbit
of $G$ consists of a set of necklaces that get identified with one another when symmetry is
taken into account. To count the orbits, let us compute $|\mathrm{Fix}(g)|$ for each $g \in G$. First,
$\mathrm{id} = (1)(2)(3)(4)(5)$ fixes every object in $X$, so $|\mathrm{Fix}(\mathrm{id})| = |X| = 7^5$ by the product rule.
Second, the generator $g = (1,2,3,4,5)$ fixes $(y_1, y_2, y_3, y_4, y_5)$ iff

$$(y_1, y_2, y_3, y_4, y_5) = (y_5, y_1, y_2, y_3, y_4).$$

Comparing coordinates, this holds iff $y_1 = y_2 = y_3 = y_4 = y_5$ iff all the $y_i$’s are equal to
one another. So $|\mathrm{Fix}((1,2,3,4,5))| = 7$ since there are seven choices for $y_1$. Next, what is
$|\mathrm{Fix}(g^2)|$? We have $g^2 = (1,3,5,2,4)$, so that $g^2$ fixes $(y_1, y_2, y_3, y_4, y_5)$ iff

$$(y_1, y_2, y_3, y_4, y_5) = (y_4, y_5, y_1, y_2, y_3),$$

which holds iff $y_1 = y_3 = y_5 = y_2 = y_4$. So $|\mathrm{Fix}(g^2)| = 7$. Similarly, $|\mathrm{Fix}(g^3)| = |\mathrm{Fix}(g^4)| =
7$, so the answer is

$$\frac{7^5 + 7 + 7 + 7 + 7}{5} = 3367.$$

Now suppose we are counting six-bead necklaces, identifying all rotations of a given
necklace. Here, the group of symmetries is

$$G = \{\mathrm{id}, (1,2,3,4,5,6), (1,3,5)(2,4,6), (1,4)(2,5)(3,6), (1,5,3)(2,6,4), (1,6,5,4,3,2)\}.$$

As before, id has $7^6$ fixed points, and each of the two six-cycles has 7 fixed points. What
about $\mathrm{Fix}((1,3,5)(2,4,6))$? We have

$$(1,3,5)(2,4,6) * (y_1, y_2, y_3, y_4, y_5, y_6) = (y_5, y_6, y_1, y_2, y_3, y_4),$$

and this equals $(y_1, \ldots, y_6)$ iff $y_1 = y_3 = y_5$ and $y_2 = y_4 = y_6$. Here there are 7 choices
for $y_1$, 7 choices for $y_2$, and the remaining $y_i$’s are then forced. So $|\mathrm{Fix}((1,3,5)(2,4,6))| =
7^2$. Likewise, $|\mathrm{Fix}((1,5,3)(2,6,4))| = 7^2$. Similarly, we find that $(y_1, \ldots, y_6)$ is fixed by
$(1,4)(2,5)(3,6)$ iff $y_1 = y_4$ and $y_2 = y_5$ and $y_3 = y_6$, so that there are $7^3$ such fixed points.
In each case, $|\mathrm{Fix}(f)|$ turned out to be $7^{\mathrm{cyc}(f)}$ where $\mathrm{cyc}(f)$ is the number of cycles in the
complete cycle decomposition of $f$ (including 1-cycles). The number of necklaces is

$$\frac{7^6 + 7 + 7^2 + 7^3 + 7^2 + 7}{6} = 19{,}684.$$

Now consider the question of counting five-bead necklaces using $q$ types of beads, where
rotations and reflections of a given necklace are considered equivalent. For this problem, the
group of symmetries to use is the automorphism group of the cycle graph $C_5$ (see 9.65). In
addition to the five powers of $(1,2,3,4,5)$, this group contains the following five permutations
corresponding to reflections of the necklace:

$$(1,5)(2,4)(3), \quad (1,4)(2,3)(5), \quad (1,3)(4,5)(2), \quad (1,2)(3,5)(4), \quad (2,5)(3,4)(1).$$

The reader may check that each of the five new permutations has $q^3 = q^{\mathrm{cyc}(g)}$ fixed points.
For example, a necklace $(y_1, \ldots, y_5)$ is fixed by $(1,5)(2,4)(3)$ iff $y_1 = y_5$ ($q$ choices) and
$y_2 = y_4$ ($q$ choices) and $y_3$ is arbitrary ($q$ choices). So, the number of necklaces is

$$\frac{q^5 + 5q^3 + 4q}{10}.$$

The following general example can be used to solve many counting problems involving 
symmetry. 


9.142. Example: Counting Colorings under Symmetries. Suppose $V$ is a finite set of
objects, $C$ is a finite set of $q$ colors, and $G \subseteq \mathrm{Sym}(V)$ is a group of symmetries of the objects
$V$. (For example, if $V$ is the vertex set of a graph, we could take $G$ to be the automorphism
group of the graph.) $G$ acts on $V$ via $g \cdot x = g(x)$ for $g \in G$ and $x \in V$. Now let $X = {}^{V}C$ be
the set of all functions $f : V \to C$. We think of a function $f$ as a coloring of $V$ such that
$x$ receives color $f(x)$ for all $x \in V$. As we saw in 9.88, $G$ acts on $X$ via $g * f = f \circ g^{-1}$ for
$g \in G$ and $f \in X$. Informally, if $f$ assigns color $c$ to object $x$, then $g * f$ assigns color $c$ to
object $g(x)$. The $G$-orbits consist of colorings that get identified when we take into account
the symmetries in $G$. So the number of colorings “up to symmetry” is $\frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}(g)|$. In
the previous example, we observed that $|\mathrm{Fix}(g)| = q^{\mathrm{cyc}(g)}$. To see why this holds in general,
fix $g \in G$ and write a complete cycle decomposition $g = C_1 C_2 \cdots C_k$, so $k = \mathrm{cyc}(g)$. Let $V_i$
be the elements appearing in cycle $C_i$, so $V$ is the disjoint union of the sets $V_i$. Consider
$C_1$, for example. Say $C_1 = (x_1, x_2, \ldots, x_s)$, so that $V_1 = \{x_1, \ldots, x_s\}$. Suppose $f \in X$ is
fixed by $g$, so $f = g * f$. Then

$$f(x_2) = (g * f)(x_2) = f(g^{-1}(x_2)) = f(x_1).$$

Similarly, $f(x_3) = f(x_2)$, and in general $f(x_{j+1}) = f(x_j)$ for all $j < s$. It follows that $f$ is
constant on $V_1$. Similarly, $f$ is constant on every $V_i$ in the sense that $f$ assigns the same color
to every $x \in V_i$. This argument is reversible, so $\mathrm{Fix}(g)$ consists precisely of the colorings
$f \in {}^{V}C$ that are constant on each $V_i$. To build such an $f$, choose a common color for all
the vertices in $V_i$ (for $1 \le i \le k$). By the product rule, $|\mathrm{Fix}(g)| = q^k = q^{\mathrm{cyc}(g)}$ as claimed.
Therefore, the answer to the counting problem is

$$\frac{1}{|G|} \sum_{g \in G} q^{\mathrm{cyc}(g)}. \tag{9.8}$$

9.143. Example: Counting Chessboards. We now answer the question posed at the
beginning of this chapter: how many ways can we color a $5 \times 5$ chessboard with seven
colors, if all rotations and reflections of a given colored board are considered the same? We
apply the method of the preceding example. Let $B = (V, E)$ be the graph that models the
chessboard (Figure 9.3). Let $C = \{1, 2, \ldots, 7\}$ be the set of colors, and let $X = {}^{V}C$ be the
set of colorings before accounting for symmetry. The symmetry group $G = \mathrm{Aut}(B)$ was
computed in 9.67. By inspecting the cycle decompositions for the eight elements $g \in G$, the
answer follows from (9.8):

$$\frac{7^{25} + 2 \cdot 7^{7} + 7^{13} + 4 \cdot 7^{15}}{8} = 167{,}633{,}579{,}843{,}887{,}699{,}759.$$
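The chessboard count can be recomputed from scratch. The Python sketch below is a hedged illustration: it assumes that the eight symmetries are the four rotations of the board and their compositions with a left-right mirror (the group from 9.67 is not reproduced here), builds each as a permutation of the 25 cells, and applies formula (9.8) with $q = 7$. The helper names are hypothetical.

```python
N = 5
cells = [(r, c) for r in range(N) for c in range(N)]
index = {cell: i for i, cell in enumerate(cells)}

def rotate(r, c):
    """Quarter turn of the board."""
    return (c, N - 1 - r)

def mirror(r, c):
    """Left-right reflection of the board."""
    return (r, N - 1 - c)

def as_permutation(turns, flip):
    """Permutation of cell indices: optional mirror, then `turns` quarter turns."""
    perm = []
    for (r, c) in cells:
        if flip:
            r, c = mirror(r, c)
        for _ in range(turns):
            r, c = rotate(r, c)
        perm.append(index[(r, c)])
    return tuple(perm)

def cycle_count(p):
    """Number of cycles (including fixed points) of a permutation tuple."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = p[x]
    return count

group = {as_permutation(t, f) for t in range(4) for f in (False, True)}
cycle_counts = sorted(cycle_count(g) for g in group)
answer = sum(7 ** k for k in cycle_counts) // len(group)
print(cycle_counts, answer)
```

The cycle counts come out as 25 for the identity, 7 for each quarter turn, 13 for the half turn, and 15 for each of the four reflections, which reproduces the value stated in the example.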


9.19 Pólya’s Formula


Consider the following variation of the chessboard coloring example: how many ways can
we color a $5 \times 5$ chessboard so that 10 squares are red, 12 are blue, and 3 are green, if all
rotations and reflections of a colored board are equivalent? We can answer questions like
this with the aid of “Pólya’s Formula,” which extends Burnside’s Lemma to weighted sets.

Let a finite group $G$ act on a finite set $X$. Let $\{O_1, \ldots, O_N\}$ be the orbits of this action.
Suppose each $x \in X$ has a weight $\mathrm{wt}(x)$ in some polynomial ring $R$, and suppose that the
weights are $G$-invariant: $\mathrm{wt}(g * x) = \mathrm{wt}(x)$ for all $g \in G$ and all $x \in X$. This condition
implies that every object in a given $G$-orbit has the same weight. So we can assign a
well-defined weight to each orbit by letting $\mathrm{wt}(O_i) = \mathrm{wt}(x_i)$ for any $x_i \in O_i$. The next result
lets us compute the generating function for the set of weighted orbits.


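9.144. Orbit-Counting Theorem for Weighted Sets. With the above notation,

$$\sum_{i=1}^{N} \mathrm{wt}(O_i) = \frac{1}{|G|} \sum_{g \in G} \sum_{x \in \mathrm{Fix}(g)} \mathrm{wt}(x).$$

So, the weighted sum of the orbits is the average over $G$ of the weighted fixed point sets of
elements of $G$.


Proof. We adapt the proof of the original orbit-counting theorem to include weights. Define
$f : X \to R$ by setting $f(x) = \mathrm{wt}(x)/|Gx|$ for each $x \in X$. On one hand,

$$\sum_{x \in X} f(x) = \sum_{i=1}^{N} \sum_{x \in O_i} f(x) = \sum_{i=1}^{N} \sum_{x \in O_i} \frac{\mathrm{wt}(x)}{|O_i|} = \sum_{i=1}^{N} \sum_{x \in O_i} \frac{\mathrm{wt}(O_i)}{|O_i|} = \sum_{i=1}^{N} \mathrm{wt}(O_i).$$

On the other hand, using $|Gx| = |G|/|\mathrm{Stab}(x)|$, we get

$$\sum_{x \in X} f(x) = \sum_{x \in X} \frac{|\mathrm{Stab}(x)|\,\mathrm{wt}(x)}{|G|} = \frac{1}{|G|} \sum_{x \in X} \sum_{g \in G} \chi(g * x = x)\,\mathrm{wt}(x)
= \frac{1}{|G|} \sum_{g \in G} \sum_{x \in X} \chi(g * x = x)\,\mathrm{wt}(x) = \frac{1}{|G|} \sum_{g \in G} \sum_{x \in \mathrm{Fix}(g)} \mathrm{wt}(x). \qquad \Box$$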


We now extend the setup of 9.142 to count weighted colorings. We are given finite sets
$V$ and $C = \{1, \ldots, q\}$, a subgroup $G$ of $\mathrm{Sym}(V)$, and the set of colorings $X = {}^{V}C$. $G$ acts
on $X$ by permuting the domain: $g * f = f \circ g^{-1}$ for $g \in G$ and $f \in X$. We define a weight
for a given coloring by setting

$$\mathrm{wt}(f) = \prod_{x \in V} z_{f(x)} \in \mathbb{R}[z_1, z_2, \ldots, z_q].$$

Note that $\mathrm{wt}(f) = z_1^{e_1} \cdots z_q^{e_q}$ iff $f$ colors $e_i$ of the objects in $V$ with color $i$. We see that

$$\mathrm{wt}(g * f) = \prod_{x \in V} z_{f(g^{-1}(x))} = \prod_{v \in V} z_{f(v)} = \mathrm{wt}(f) \qquad (g \in G,\ f \in X)$$

by making the change of variable $v = g^{-1}(x)$. So the weighted orbit-counting theorem is
applicable. In the unweighted case (see 9.142), we found that $|\mathrm{Fix}(g)| = q^{\mathrm{cyc}(g)}$ by arguing
that $f \in \mathrm{Fix}(g)$ must be constant on each connected component $V_1, \ldots, V_k$ of the digraph of
the permutation $g$. To take weights into account, let us construct such an $f$ using the product
rule for weighted sets. Suppose the components $V_1, V_2, \ldots, V_k$ have sizes $n_1 \ge n_2 \ge \cdots \ge n_k$
(so that $\mathrm{type}(g) = (n_1, n_2, \ldots, n_k)$). First choose a common color for the $n_1$ vertices in $V_1$.
The generating function for this choice is $z_1^{n_1} + z_2^{n_1} + \cdots + z_q^{n_1}$; the term $z_i^{n_1}$ arises by
coloring all $n_1$ vertices in $V_1$ with color $i$. Second, choose a common color for the $n_2$ vertices
in $V_2$. The generating function for this choice is $z_1^{n_2} + \cdots + z_q^{n_2}$. Continuing similarly, we
arrive at the formula

$$\sum_{x \in \mathrm{Fix}(g)} \mathrm{wt}(x) = \prod_{i=1}^{k} \left(z_1^{n_i} + z_2^{n_i} + \cdots + z_q^{n_i}\right).$$

We can abbreviate this formula by introducing the power-sum polynomials (which are studied
in more detail in Chapter 10). For each integer $k \ge 1$, set $p_k(z_1, \ldots, z_q) = z_1^k + z_2^k + \cdots +
z_q^k$. For each integer partition $\mu = (\mu_1, \mu_2, \ldots, \mu_\ell)$, set $p_\mu(z_1, \ldots, z_q) = \prod_{i=1}^{\ell} p_{\mu_i}(z_1, \ldots, z_q)$.
Then the weighted orbit-counting formula assumes the following form.


9.145. Pólya’s Formula. With the above notation, the generating function for weighted
colorings with $q$ colors relative to the symmetry group $G$ is

$$\sum_{i=1}^{N} \mathrm{wt}(O_i) = \frac{1}{|G|} \sum_{g \in G} p_{\mathrm{type}(g)}(z_1, z_2, \ldots, z_q) \in \mathbb{R}[z_1, \ldots, z_q].$$

The coefficient of $z_1^{e_1} \cdots z_q^{e_q}$ in this polynomial is the number of colorings (taking the
symmetries in $G$ into account) in which color $i$ is used $e_i$ times.


9.146. Example. The generating function for five-bead necklaces using $q$ types of beads
(identifying all rotations and reflections of a given necklace) is

$$(p_{(1,1,1,1,1)} + 4p_{(5)} + 5p_{(2,2,1)})/10,$$

where all power-sum polynomials are evaluated at $(z_1, \ldots, z_q)$.


9.147. Example. Let us use Pólya’s formula to count $5 \times 5$ chessboards with 10 red
squares, 12 blue squares, and 3 green squares. We may as well take $q = 3$ here. Consulting
the cycle decompositions in 9.67 again, we find that the group $G$ has one element of type
$(1^{25}) = (1, 1, \ldots, 1)$, two elements of type $(4^6, 1)$, one element of type $(2^{12}, 1)$, and four
elements of type $(2^{10}, 1^5)$. Therefore, $\sum_{i=1}^{N} \mathrm{wt}(O_i)$ is given by

$$\frac{p_{(1^{25})}(z_1, z_2, z_3) + 2p_{(4^6,1)}(z_1, z_2, z_3) + p_{(2^{12},1)}(z_1, z_2, z_3) + 4p_{(2^{10},1^5)}(z_1, z_2, z_3)}{8}.$$

Using a computer algebra system, we can compute this polynomial and extract the
coefficient of $z_1^{10} z_2^{12} z_3^{3}$. The final answer is 185,937,878.
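A computer algebra system is not strictly needed for this coefficient. Since $p_\mu$ is a product of one factor $z_1^{n_i} + \cdots + z_q^{n_i}$ per cycle, the coefficient of $z_1^{e_1} \cdots z_q^{e_q}$ in $p_\mu$ counts the ways to give each cycle a single color so that color $i$ covers $e_i$ cells in total. The Python sketch below (with the hypothetical helper `coefficient`) computes this count by memoized recursion, using the cycle types listed in the example.

```python
from functools import lru_cache

def coefficient(cycle_lengths, target):
    """Coefficient of z_1^{e_1}...z_q^{e_q} in p_mu, where mu = cycle_lengths and e = target."""
    lengths = tuple(cycle_lengths)

    @lru_cache(maxsize=None)
    def ways(pos, remaining):
        # remaining[i] = number of cells still to be colored with color i
        if pos == len(lengths):
            return 1 if not any(remaining) else 0
        total = 0
        for color, left in enumerate(remaining):
            if left >= lengths[pos]:
                nxt = remaining[:color] + (left - lengths[pos],) + remaining[color + 1:]
                total += ways(pos + 1, nxt)
        return total

    return ways(0, tuple(target))

types = [  # (number of group elements, cycle type), as listed in Example 9.147
    (1, [1] * 25),
    (2, [4] * 6 + [1]),
    (1, [2] * 12 + [1]),
    (4, [2] * 10 + [1] * 5),
]
target = (10, 12, 3)   # 10 red, 12 blue, 3 green
total = sum(m * coefficient(mu, target) for m, mu in types)
print(total // 8)
```

The script prints 185937878, in agreement with the example. The elements of type $(4^6, 1)$ contribute nothing, because three green squares cannot be covered by 4-cycles and a single fixed cell.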




Summary 


Table 9.2 summarizes some definitions from group theory used in this chapter. Table 9.3 
contains definitions pertinent to the theory of group actions. 


• Examples of Groups. (i) additive commutative groups: $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Z}_n =
\{0, 1, \ldots, n-1\}$ (under addition modulo $n$); (ii) multiplicative commutative groups:
invertible elements in $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Z}_n$; (iii) non-commutative groups: invertible
matrices in $M_n(R)$, the group $\mathrm{Sym}(X)$ of bijections on $X$ under composition, dihedral
groups (automorphism groups of cycle graphs); (iv) constructions of groups: product
groups (see 9.153), subgroups, quotient groups (see 9.205), cyclic subgroup generated
by a group element, automorphism group of a graph, automorphism group of a group.


• Basic Properties of Groups. The identity of a group is unique, as is the inverse of each
group element. In a group, there are left and right cancellation laws: $(ax = ay) \Rightarrow (x = y)$
and $(xa = ya) \Rightarrow (x = y)$; inverse rules: $(x^{-1})^{-1} = x$ and $(x_1 \cdots x_n)^{-1} = x_n^{-1} \cdots x_1^{-1}$;
and the laws of exponents: $x^{m+n} = x^m x^n$; $(x^m)^n = x^{mn}$; and, when $xy = yx$, $(xy)^n = x^n y^n$.


• Notation for Permutations. A bijection $f \in S_n$ can be described in two-line form
$\begin{pmatrix} 1 & 2 & \cdots & n \\ f(1) & f(2) & \cdots & f(n) \end{pmatrix}$, in one-line form $[f(1), f(2), \ldots, f(n)]$, or in cycle notation.
The cycle notation is obtained by listing the elements going around each directed cycle
in the digraph of $f$, enclosing each cycle in parentheses, and optionally omitting cycles
of length 1. The cycle notation for $f$ is not unique.
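The passage from one-line form to cycle notation can be sketched in a few lines. The helper name below is hypothetical; it starts each cycle at its smallest element and keeps cycles of length 1, which is only one of the many valid ways to write the cycle notation.

```python
def cycles_from_one_line(one_line):
    """Convert one-line form [f(1), ..., f(n)] into a list of cycles.

    Each cycle is a tuple (x, f(x), f(f(x)), ...) starting at its least element.
    """
    n = len(one_line)
    seen = [False] * (n + 1)
    cycles = []
    for start in range(1, n + 1):
        if seen[start]:
            continue
        cycle = []
        x = start
        # Follow the arrows x -> f(x) in the digraph of f until we return.
        while not seen[x]:
            seen[x] = True
            cycle.append(x)
            x = one_line[x - 1]
        cycles.append(tuple(cycle))
    return cycles


print(cycles_from_one_line([2, 3, 1, 5, 4, 6]))  # [(1, 2, 3), (4, 5), (6,)]
```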


• Sorting, Inversions, and Sign. A permutation $w = w_1 w_2 \cdots w_n \in S_n$ can be sorted
to the identity permutation $\mathrm{id} = 12 \cdots n$ by applying $\mathrm{inv}(w)$ basic transpositions to
switch adjacent elements that are out of order. It follows that $w$ can be written as the
composition of $\mathrm{inv}(w)$ basic transpositions. Any factorization of $w$ into a product of
transpositions must involve an even number of terms when $\mathrm{sgn}(w) = +1$, or an odd
number when $\mathrm{sgn}(w) = -1$. Sign is a group homomorphism: $\mathrm{sgn}(f \circ g) = \mathrm{sgn}(f) \cdot \mathrm{sgn}(g)$
for $f, g \in S_n$. The sign of a $k$-cycle is $(-1)^{k-1}$. For all $f \in S_n$, $\mathrm{sgn}(f) = (-1)^{n - \mathrm{cyc}(f)}$.
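The facts about sorting, inversions, and sign can be checked exhaustively for small $n$. The following sketch uses hypothetical helper names and one particular sorting procedure (bubble sort) to confirm that sorting by adjacent swaps uses exactly $\mathrm{inv}(w)$ swaps and that $(-1)^{\mathrm{inv}(w)} = (-1)^{n - \mathrm{cyc}(w)}$ for every $w \in S_5$.

```python
from itertools import permutations


def inv(w):
    """Number of pairs i < j with w[i] > w[j]."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])


def cyc(w):
    """Number of cycles of w (one-line form on 1..n), counting fixed points."""
    n = len(w)
    seen = [False] * (n + 1)
    count = 0
    for start in range(1, n + 1):
        if not seen[start]:
            count += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = w[x - 1]
    return count


def sorting_swaps(w):
    """Bubble-sort w with adjacent swaps and return how many swaps were used."""
    w = list(w)
    swaps = 0
    changed = True
    while changed:
        changed = False
        for i in range(len(w) - 1):
            if w[i] > w[i + 1]:
                w[i], w[i + 1] = w[i + 1], w[i]
                swaps += 1
                changed = True
    return swaps


n = 5
for w in permutations(range(1, n + 1)):
    assert sorting_swaps(w) == inv(w)
    assert (-1) ** inv(w) == (-1) ** (n - cyc(w))
```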


• Properties of Determinants. The determinant of a matrix $A \in M_n(R)$ is an
$R$-multilinear, alternating function of the rows (resp. columns) of $A$ such that $\det(I_n) = 1_R$.
This means that $\det(A)$ is an $R$-linear function of any given row when the other rows
are fixed, and the determinant is zero if $A$ has two equal rows; similarly for columns.
TABLE 9.2
Definitions in group theory.

Concept                               | Definition
--------------------------------------|------------------------------------------------------------
group axioms for $(G, \star)$         | $\forall x, y \in G,\ x \star y \in G$ (closure)
                                      | $\forall x, y, z \in G,\ x \star (y \star z) = (x \star y) \star z$ (associativity)
                                      | $\exists e \in G,\ \forall x \in G,\ x \star e = x = e \star x$ (identity)
                                      | $\forall x \in G,\ \exists y \in G,\ x \star y = e = y \star x$ (inverses)
commutative group                     | group $G$ with $xy = yx$ for all $x, y \in G$
$H$ is a subgroup of $G$              | $e_G \in H$ (closure under identity)
                                      | $\forall a, b \in H,\ ab \in H$ (closure under operation)
                                      | $\forall a \in H,\ a^{-1} \in H$ (closure under inverses)
$H$ is normal in $G$ ($H \trianglelefteq G$) | $\forall g \in G,\ \forall h \in H,\ ghg^{-1} \in H$ (closure under conjugation)
exponent notation in $(G, \cdot)$     | $x^0 = e_G$, $x^{n+1} = x^n x$, $x^{-n} = (x^{-1})^n$ ($n \geq 0$)
multiple notation in $(G, +)$         | $0x = 0_G$, $(n+1)x = nx + x$, $(-n)x = n(-x)$ ($n \geq 0$)
$k$-cycle                             | $f \in \mathrm{Sym}(X)$ of the form $(i_1, i_2, \ldots, i_k)$ (cycle notation)
transposition                         | a 2-cycle $(i, j)$
basic transposition                   | a 2-cycle $(i, i+1)$ in $S_n$
$\mathrm{cyc}(f)$                     | number of components in digraph of $f \in \mathrm{Sym}(X)$
$\mathrm{type}(f)$                    | list of cycle lengths of $f \in \mathrm{Sym}(X)$ in decreasing order
$\mathrm{inv}(w_1 \cdots w_n)$        | number of $i < j$ with $w_i > w_j$
$\mathrm{sgn}(w)$ for $w \in S_n$     | $(-1)^{\mathrm{inv}(w)}$
determinant of $A \in M_n(R)$         | $\det(A) = \sum_{w \in S_n} \mathrm{sgn}(w) \prod_{i=1}^{n} A(i, w(i))$
classical adjoint $\mathrm{adj}(A)$   | $\mathrm{adj}(A)_{i,j} = (-1)^{i+j} \det(A[j|i])$
cyclic subgroup $\langle x \rangle$   | $\{x^n : n \in \mathbb{Z}\}$ or $\{nx : n \in \mathbb{Z}\}$ (additive notation)
cyclic group                          | group $G$ such that $G = \langle x \rangle$ for some $x \in G$
order of $x \in G$                    | least $n > 0$ with $x^n = e_G$, or $\infty$ if no such $n$
graph automorphism of $K$             | bijection on vertex set of $K$ preserving edges of $K$
group homomorphism                    | map $f : G \to H$ with $f(xy) = f(x)f(y)$ for all $x, y \in G$
kernel of hom. $f : G \to H$          | $\ker(f) = \{x \in G : f(x) = e_H\}$
image of hom. $f : G \to H$           | $\mathrm{img}(f) = \{y \in H : y = f(x) \text{ for some } x \in G\}$
group isomorphism                     | bijective group homomorphism
group automorphism                    | group isomorphism from $G$ to itself
inner automorphism $C_g$              | the automorphism $x \mapsto gxg^{-1}$ ($g, x \in G$)


We have $\det(A^{t}) = \det(A)$. For triangular or diagonal $A$, $\det(A) = \prod_{i=1}^{n} A(i,i)$. The
Laplace expansion for $\det(A)$ along row $k$ (resp. column $k$) is

$$\det(A) = \sum_{j=1}^{n} (-1)^{k+j} A(k,j) \det(A[k|j]) = \sum_{i=1}^{n} (-1)^{i+k} A(i,k) \det(A[i|k]).$$

We have $A(\mathrm{adj}\, A) = (\det(A)) I_n = (\mathrm{adj}\, A) A$, so that $A^{-1} = (\det(A))^{-1} \mathrm{adj}(A)$ when
$\det(A)$ is invertible in $R$. If $m \leq n$, $A$ is $m \times n$, and $B$ is $n \times m$, the Cauchy-Binet
formula says

$$\det(AB) = \sum_{1 \leq j_1 < j_2 < \cdots < j_m \leq n} \det(A^{j_1}, \ldots, A^{j_m}) \det(B_{j_1}, \ldots, B_{j_m}),$$

where $A^{j}$ is the $j$th column of $A$, and $B_j$ is the $j$th row of $B$. In particular, $\det(AB) =
\det(A) \det(B)$ for $A, B \in M_n(R)$.
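The Cauchy-Binet formula can be checked numerically on a small example. The sketch below computes determinants directly from the permutation-sum definition; the sample matrices and the helper names are arbitrary illustrative choices, and indices are 0-based.

```python
from itertools import combinations, permutations


def sign(w):
    """Sign of a permutation w of 0..n-1, computed from its inversion count."""
    inversions = sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])
    return -1 if inversions % 2 else 1


def det(M):
    """Determinant via the sum over all permutations w of sgn(w) * prod M[i][w(i)]."""
    n = len(M)
    total = 0
    for w in permutations(range(n)):
        term = sign(w)
        for i in range(n):
            term *= M[i][w[i]]
        total += term
    return total


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def cauchy_binet_sum(A, B):
    """Sum over column sets J of det(columns J of A) * det(rows J of B)."""
    m, n = len(A), len(B)
    total = 0
    for cols in combinations(range(n), m):
        A_sub = [[A[i][j] for j in cols] for i in range(m)]
        B_sub = [B[j] for j in cols]
        total += det(A_sub) * det(B_sub)
    return total


A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]      # 3 x 2
assert det(matmul(A, B)) == cauchy_binet_sum(A, B) == -39
```

Here $AB = \begin{bmatrix} 5 & 11 \\ 14 & 23 \end{bmatrix}$ has determinant $-39$, and the three column pairs contribute $-3$, $-18$, and $-18$.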


• Properties of Cyclic Groups. Every cyclic group is commutative and isomorphic to $\mathbb{Z}$
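TABLE 9.3
Definitions in the theory of group actions.

Concept                               | Definition
--------------------------------------|------------------------------------------------------------
action axioms for $G$-set $X$         | $\forall g \in G,\ \forall x \in X,\ g \ast x \in X$ (closure)
                                      | $\forall x \in X,\ e_G \ast x = x$ (identity)
                                      | $\forall g, h \in G,\ \forall x \in X,\ g \ast (h \ast x) = (gh) \ast x$ (assoc.)
perm. representation of $G$ on $X$    | group homomorphism $R : G \to \mathrm{Sym}(X)$
$G$-stable subset $Y$ of $X$          | $\forall g \in G,\ \forall y \in Y,\ g \ast y \in Y$ (closure under action)
orbit of $x$ in $G$-set $X$           | $Gx = G \ast x = \{g \ast x : g \in G\}$
stabilizer of $x$ rel. to $G$-set $X$ | $\mathrm{Stab}(x) = \{g \in G : g \ast x = x\} \leq G$
fixed points of $g$ in $G$-set $X$    | $\mathrm{Fix}(g) = \{x \in X : g \ast x = x\}$
conjugacy class of $x$ in $G$         | $\{gxg^{-1} : g \in G\}$
centralizer of $x$ in $G$             | $C_G(x) = \{g \in G : gx = xg\} \leq G$
center of $G$                         | $Z(G) = \{g \in G : gx = xg \text{ for all } x \in G\} \trianglelefteq G$
normalizer of $H$ in $G$              | $N_G(H) = \{g \in G : gHg^{-1} = H\} \leq G$
left coset of $H$                     | $xH = \{xh : h \in H\}$
right coset of $H$                    | $Hx = \{hx : h \in H\}$
set of left cosets $G/H$              | for $H \leq G$, $G/H = \{xH : x \in G\}$
index $[G : H]$                       | $[G : H] = |G/H|$ = number of left cosets of $H$ in $G$
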


or $\mathbb{Z}_n$ for some $n \geq 1$. More precisely, if $G = \langle x \rangle$ is infinite, then $f : \mathbb{Z} \to G$ given by
$f(i) = x^i$ for $i \in \mathbb{Z}$ is a group isomorphism. If $G = \langle x \rangle$ has size $n$, then $g : \mathbb{Z}_n \to G$
given by $g(i) = x^i$ for $i \in \mathbb{Z}_n$ is a group isomorphism; moreover, $x^m = e$ iff $n$ divides
$m$. Every subgroup of the additive group $\mathbb{Z}$ has the form $k\mathbb{Z}$ for a unique $k \geq 0$. Every
subgroup of a cyclic group is cyclic.


• Properties of Group Homomorphisms. If $f : G \to H$ is a group homomorphism, then
$\ker(f) \trianglelefteq G$ and $\mathrm{img}(f) \leq H$. Moreover, $f(x^n) = f(x)^n$ for all $x \in G$ and $n \in \mathbb{Z}$. The
composition of group homomorphisms (resp. isomorphisms) is a group homomorphism
(resp. isomorphism), and the inverse of a group isomorphism is a group isomorphism.


• Main Results on Group Actions. Actions $\ast$ of a group $G$ on a set $X$ correspond bijectively
to permutation representations $R : G \to \mathrm{Sym}(X)$, via the formula $R(g)(x) = g \ast x$ for
$g \in G$ and $x \in X$. Every $G$-set $X$ is the disjoint union of orbits; more precisely, each
$x \in X$ lies in a unique orbit $Gx$. The size of the orbit $Gx$ is the index (number of cosets)
of the stabilizer $\mathrm{Stab}(x)$ in $G$, which (for finite $G$) is a divisor of $|G|$. The number of
orbits is the average number of fixed points of elements of $G$ (for $G$ finite); this extends
to weighted sets where the weight is constant on each orbit.
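The orbit-size and orbit-counting statements can be illustrated on a small action. The sketch below takes, as an arbitrary example, the cyclic group generated by the permutation $(0,1,2)(3,4)$ acting on $\{0,1,2,3,4\}$ by $g \ast x = g(x)$; the helper names are hypothetical.

```python
def compose(f, g):
    """Return f o g for permutations stored as tuples (g is applied first)."""
    return tuple(f[g[i]] for i in range(len(g)))


def cyclic_group(f):
    """List all powers of the permutation f."""
    identity = tuple(range(len(f)))
    elements = [identity]
    g = f
    while g != identity:
        elements.append(g)
        g = compose(f, g)
    return elements


def orbit(group, x):
    return {g[x] for g in group}


def stabilizer(group, x):
    return [g for g in group if g[x] == x]


def fixed_points(g):
    return [x for x in range(len(g)) if g[x] == x]


f = (1, 2, 0, 4, 3)          # one-line form of (0,1,2)(3,4)
group = cyclic_group(f)      # a group of size 6
points = range(len(f))

# Orbit size times stabilizer size equals |G| for every point.
for x in points:
    assert len(orbit(group, x)) * len(stabilizer(group, x)) == len(group)

# The number of orbits is the average number of fixed points.
orbits = {frozenset(orbit(group, x)) for x in points}
assert len(orbits) * len(group) == sum(len(fixed_points(g)) for g in group)
```

In this example the orbits are $\{0,1,2\}$ and $\{3,4\}$, and the six group elements have $5, 0, 2, 3, 2, 0$ fixed points, for an average of 2.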


• Examples of Group Actions. A subgroup $H$ of a group $G$ acts on $G$ by left multiplication
($h \ast x = hx$), and by inverted right multiplication ($h \ast x = xh^{-1}$), and by conjugation
($h \ast x = hxh^{-1}$). The orbits of $x$ under these respective actions are the right coset $Hx$,
the left coset $xH$, and (when $H = G$) the conjugacy class of $x$ in $G$. Similarly, $G$ (or
its subgroups) act on the set of all subsets of $G$ by left multiplication, and $G$ acts by
conjugation on the set of subgroups of $G$. The set of subsets of a fixed size $k$ are also
$G$-sets under these actions. Centralizers of elements and normalizers of subgroups are
stabilizers under suitable actions, hence subgroups of $G$. Any subgroup $G$ of $\mathrm{Sym}(X)$
acts on $X$ by $g \ast x = g(x)$ for $g \in G$ and $x \in X$. For any set $X$, $S_n$ (or its subgroups)
acts on $X^n$ via $f \cdot (x_1, \ldots, x_n) = (x_{f^{-1}(1)}, \ldots, x_{f^{-1}(n)})$. For any subgroup $H$ of $G$, $G$
acts on $G/H$ via $g \ast (xH) = (gx)H$ for $g, x \in G$.
• Facts about Cosets. Given a subgroup $H$ of a group $G$, $G$ is the disjoint union of its left
(resp. right) cosets, which all have the same cardinality as $H$. This implies Lagrange's
theorem: $|G| = |H| \cdot [G : H]$, so that (for finite $G$), the order and index of any subgroup
of $G$ are both divisors of $|G|$. To test equality of left cosets, one may check any of the
following equivalent conditions: $xH = yH$; $x \in yH$; $x = yh$ for some $h \in H$; $y^{-1}x \in H$;
$x^{-1}y \in H$. Similarly, $Hx = Hy$ iff $xy^{-1} \in H$ iff $yx^{-1} \in H$. Left and right cosets coincide
(i.e., $xH = Hx$ for all $x \in G$) iff $H$ is normal in $G$ iff all conjugates $xHx^{-1}$ equal $H$ iff
$H$ is a union of conjugacy classes of $G$. Given a group homomorphism $f : G \to L$ with
kernel $K$, $Kx = xK = \{y \in G : f(y) = f(x)\}$ for all $x \in G$.
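A small computation makes the coset facts concrete. The sketch below uses an arbitrary example, the two-element subgroup generated by a transposition inside the symmetric group on $\{0,1,2\}$, with permutations stored as tuples; all helper names are illustrative.

```python
from itertools import permutations


def compose(f, g):
    """Return f o g for permutations stored as tuples (g is applied first)."""
    return tuple(f[g[i]] for i in range(len(g)))


def inverse(f):
    result = [0] * len(f)
    for i, image in enumerate(f):
        result[image] = i
    return tuple(result)


def left_coset(x, H):
    return frozenset(compose(x, h) for h in H)


def right_coset(H, x):
    return frozenset(compose(h, x) for h in H)


G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]   # the identity and the transposition (0,1)

left_cosets = {left_coset(x, H) for x in G}

# The left cosets partition G into blocks of size |H| (Lagrange's theorem).
assert all(len(coset) == len(H) for coset in left_cosets)
assert len(left_cosets) * len(H) == len(G)
assert set().union(*left_cosets) == set(G)

# Coset equality test: xH = yH iff x^{-1} y lies in H.
for x in G:
    for y in G:
        assert (left_coset(x, H) == left_coset(y, H)) == (compose(inverse(x), y) in H)

# H is not normal here: some left coset differs from the matching right coset.
is_normal = all(left_coset(x, H) == right_coset(H, x) for x in G)
assert not is_normal
```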


• Conjugacy Classes. Every group $G$ is the disjoint union of its conjugacy classes, where
the conjugacy class of $x$ is $\{gxg^{-1} : g \in G\}$. Conjugacy classes need not all have the
same size. The size of the conjugacy class of $x$ is the index $[G : C_G(x)]$, where $C_G(x)$ is
the subgroup $\{y \in G : xy = yx\}$; this index is a divisor of $|G|$ for $G$ finite. For $x \in G$,
the conjugacy class of $x$ has size 1 iff $x$ is in the center $Z(G)$. This can be used to show
that groups $G$ of size $p^n$ (where $p$ is prime and $n \geq 1$) have $|Z(G)| > 1$. Each conjugacy
class of $S_n$ consists of those $f \in S_n$ with a given cycle type $\mu \in \mathrm{Par}(n)$. This follows
from the fact that the cycle notation for $gfg^{-1}$ is the cycle notation for $f$ with each
value $x$ replaced by $g(x)$. The size of the conjugacy class indexed by $\mu$ is $n!/z_\mu$.
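The class-size formula $n!/z_\mu$ can be verified by enumeration for small $n$. The sketch below assumes the usual definition $z_\mu = \prod_i i^{m_i}\, m_i!$, where $m_i$ is the number of parts of $\mu$ equal to $i$ (this definition is not restated in the summary); the helper names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def cycle_type(w):
    """Cycle lengths of w (a permutation of 0..n-1), in decreasing order."""
    seen = [False] * len(w)
    lengths = []
    for start in range(len(w)):
        if not seen[start]:
            length = 0
            x = start
            while not seen[x]:
                seen[x] = True
                length += 1
                x = w[x]
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def z(mu):
    """Assumed definition: product over part sizes i of i^(m_i) * m_i!."""
    result = 1
    for part, multiplicity in Counter(mu).items():
        result *= part**multiplicity * factorial(multiplicity)
    return result


n = 5
class_sizes = Counter(cycle_type(w) for w in permutations(range(n)))

for mu, size in class_sizes.items():
    assert size == factorial(n) // z(mu)
```

For $n = 5$ this produces seven classes, one for each partition of 5; for instance the class of type $(2,2,1)$ has $120/8 = 15$ elements.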


• Cayley's Theorem on Permutation Representations. Every group $G$ is isomorphic to a
subgroup of $\mathrm{Sym}(G)$, via the homomorphism sending $g \in G$ to the left multiplication
$L_g = (x \mapsto gx : x \in G)$. Every $n$-element group is isomorphic to a subgroup of $S_n$.


• Theorems Provable by Group Actions. (i) Fermat's Little Theorem: $a^p \equiv a \pmod{p}$ for
$a \in \mathbb{N}^{+}$ and $p$ prime. (ii) Cauchy's Theorem: If $G$ is a group and $p$ is a prime divisor of
$|G|$, then there exists $x \in G$ of order $p$. (iii) Lucas' Congruence: For $0 \leq k \leq n$ and prime
$p$, $\binom{n}{k} \equiv \prod_{i \geq 0} \binom{n_i}{k_i} \pmod{p}$, where the $n_i$ and $k_i$ are the base-$p$ digits of $n$ and $k$. (iv)
Sylow's First Theorem: If $G$ is a group and $|G|$ has prime factorization $|G| = p_1^{n_1} \cdots p_k^{n_k}$,
then $G$ has a subgroup of size $p_i^{n_i}$ for $1 \leq i \leq k$.
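Two of these congruences are easy to test numerically. The sketch below checks Fermat's little theorem and Lucas' congruence for a few small primes; the function name `lucas` is an illustrative choice, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def lucas(n, k, p):
    """Product of binomial(n_i, k_i) mod p over the base-p digits of n and k."""
    result = 1
    while n > 0 or k > 0:
        # comb(a, b) is 0 when b > a, which matches the convention in the congruence.
        result = result * comb(n % p, k % p) % p
        n //= p
        k //= p
    return result


for p in (2, 3, 5, 7):
    # Fermat's little theorem: a^p is congruent to a modulo p.
    for a in range(1, 50):
        assert pow(a, p, p) == a % p
    # Lucas' congruence for all 0 <= k <= n < 60.
    for n in range(60):
        for k in range(n + 1):
            assert comb(n, k) % p == lucas(n, k, p)
```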


• Counting Colorings under Symmetries. Given a finite set $V$, a group of symmetries
$G \leq \mathrm{Sym}(V)$, and a set $C$ of $q$ colors, the number of colorings $f : V \to C$ taking
symmetries into account is $|G|^{-1} \sum_{g \in G} q^{\mathrm{cyc}(g)}$. If the colors are weighted using $z_1, \ldots, z_q$,
the generating function for weighted colorings is given by Pólya's formula

$$\frac{1}{|G|} \sum_{g \in G} p_{\mathrm{type}(g)}(z_1, \ldots, z_q),$$

where $p_\mu$ is a power-sum symmetric polynomial. The coefficient of $z_1^{e_1} \cdots z_q^{e_q}$ gives the
number of colorings (taking the symmetries in $G$ into account) where color $i$ is used $e_i$
times.
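The unweighted count $|G|^{-1} \sum_{g \in G} q^{\mathrm{cyc}(g)}$ is straightforward to evaluate once the symmetries are written as permutations. As a sketch, the code below builds the eight rotations and reflections of an $n \times n$ board as permutations of its cells and applies the formula; the construction and the function names are illustrative, not taken from the text.

```python
def board_symmetries(n):
    """The 8 rotations and reflections of an n x n board, as permutations of cells.

    Cell (i, j) has index i * n + j; each symmetry is a tuple of image indices.
    """
    cells = [(i, j) for i in range(n) for j in range(n)]
    index = {cell: k for k, cell in enumerate(cells)}
    group = set()
    for use_flip in (False, True):
        for turns in range(4):
            images = []
            for (i, j) in cells:
                if use_flip:
                    j = n - 1 - j            # reflect left-to-right
                for _ in range(turns):
                    i, j = j, n - 1 - i      # rotate by a quarter turn
                images.append(index[(i, j)])
            group.add(tuple(images))
    return sorted(group)


def cyc(g):
    """Number of cycles of the permutation g, counting fixed points."""
    seen = [False] * len(g)
    count = 0
    for start in range(len(g)):
        if not seen[start]:
            count += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = g[x]
    return count


def count_colorings(group, q):
    """Number of q-colorings up to symmetry: average of q^cyc(g) over the group."""
    return sum(q ** cyc(g) for g in group) // len(group)


print(count_colorings(board_symmetries(3), 2))  # 102
```

For the $3 \times 3$ board with two colors the sum is $(2^9 + 2 \cdot 2^3 + 2^5 + 4 \cdot 2^6)/8 = 102$. For the $5 \times 5$ board, the cycle counts 25, 7, 7, 13, 15, 15, 15, 15 agree with the cycle types $(1^{25})$, $(4^6,1)$, $(2^{12},1)$, and $(2^{10},1^5)$.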


Exercises 


9.148. Let $X$ be a set with more than one element. Define $a \star b = b$ for all $a, b \in X$. (a)
Prove that $(X, \star)$ satisfies the closure axiom and associativity axiom in 9.1. (b) Does there
exist $e \in X$ such that $e \star x = x$ for all $x \in X$? If so, is this $e$ unique? (c) Does there exist
$e \in X$ such that $x \star e = x$ for all $x \in X$? If so, is this $e$ unique? (d) Is $(X, \star)$ a group?
9.149. Let $G$ be the set of odd integers. For all $x, y \in G$, define $x \star y = x + y + 5$. Prove
that $(G, \star)$ is a commutative group.


9.150. Let $G$ be the set of real numbers unequal to 1. For each $a, b \in G$, define $a \star b =
a + b - ab$. Prove that $(G, \star)$ is a commutative group.


9.151. Assume $(G, \star)$ is a group such that $x \star x = e$ for all $x \in G$, where $e$ is the identity
element of $G$. Prove that $G$ is commutative.


9.152. Let $(G, \star)$ be a group. Define $\bullet : G \times G \to G$ by setting $a \bullet b = b \star a$ for all $a, b \in G$.
Prove that $(G, \bullet)$ is a group.


9.153. Product Groups. Let $(G, \star)$ and $(H, \bullet)$ be groups. (a) Show that $G \times H$ becomes
a group under the operation $(g_1, h_1) \ast (g_2, h_2) = (g_1 \star g_2, h_1 \bullet h_2)$ for $g_1, g_2 \in G$, $h_1, h_2 \in H$.
(b) Show $G \times H$ is commutative iff $G$ and $H$ are commutative.


9.154. Prove the associative axiom for $(\mathbb{Z}_n, \oplus)$ by verifying (9.1).


9.155. Suppose $G$ is a set, $\star : G \times G \to G$ is associative, and there exists $e \in G$ such that
for all $x \in G$, $e \star x = x$ and there is $y \in G$ with $y \star x = e$. Prove $(G, \star)$ is a group.


9.156. For $x, y$ in a group $G$, define the commutator $[x, y] = xyx^{-1}y^{-1}$, and let $C_x(y) =
xyx^{-1}$ (conjugation by $x$). Verify that the following identities hold for all $x, y, z \in G$:
(a) $[x, y]^{-1} = [y, x]$; (b) $[x, yz] = [x, y]\, C_y([x, z])$; (c) $[x, yz][y, zx][z, xy] = e_G$; (d)
$[[x, y], C_y(z)]\,[[y, z], C_z(x)]\,[[z, x], C_x(y)] = e_G$.


9.157. Give complete proofs of the three laws of exponents in 9.10. 


9.158. Let $G$ be a group. For each $g \in G$, define a function $R_g : G \to G$ by setting
$R_g(x) = xg$ for each $x \in G$. $R_g$ is called "right multiplication by $g$." (a) Prove that $R_g$ is
one-to-one and onto. (b) Prove that $R_e = \mathrm{id}_G$ (where $e$ is the identity of $G$) and $R_g \circ R_h = R_{hg}$
for all $g, h \in G$. (c) Point out why $R_g$ is an element of $\mathrm{Sym}(G)$. Give two answers, one based
on (a) and one based on (b). (d) Define $\phi : G \to \mathrm{Sym}(G)$ by setting $\phi(g) = R_g$ for $g \in G$.
Prove that $\phi$ is one-to-one. (e) Prove that for all $g, h \in G$, $L_g \circ R_h = R_h \circ L_g$ (where $L_g$ is
left multiplication by $g$).


9.159. Let $G$ be a group. (a) Prove that for all $a, b \in G$, there exists a unique $x \in G$
with $ax = b$. (b) Prove that in the multiplication table for a group $G$, every group element
appears exactly once in each row and column.


9.160. A certain group (G, x) has a multiplication table that has been partly filled in below: 


123 4 
4 


wn | + 
_ 


Use properties of groups to fill in the rest of the table. 


9.161. Let $f, g \in S_8$ be given in one-line form by $f = [3, 2, 7, 5, 1, 4, 8, 6]$ and $g =
[4, 5, 1, 3, 2, 6, 8, 7]$. (a) Write $f$ and $g$ in cycle notation. (b) Compute $f \circ g$, $g \circ f$, $g \circ g$, and
$f^{-1}$, giving final answers in one-line form.


9.162. Let h = [4,1,3,6,5, 2] in one-line form. Compute inv(h) and sgn(h). Write h as a 
product of inv(h) basic transpositions. 




9.163. Let $f = (1, 3, 6)(2, 8)(4)(5, 7)$ and $g = (5, 4, 3, 2, 1)(7, 8)$. (a) Compute $fg$, $gf$,
$fgf^{-1}$, and $gfg^{-1}$, giving all answers in cycle notation. (b) Compute $\mathrm{sgn}(f)$ and $\mathrm{sgn}(g)$
without counting inversions. (c) Find an $h \in S_8$ such that $hfh^{-1} = (1, 2, 3)(4, 5)(6)(7, 8)$;
give the answer in two-line form.


9.164. Suppose that $f \in S_n$ has cycle type $\mu = (\mu_1, \ldots, \mu_k)$. What is the order of $f$?


9.165. The support of a bijection $f \in \mathrm{Sym}(X)$ is the set $\mathrm{supp}(f) = \{x \in X : f(x) \neq x\}$.
Two permutations $f, g \in \mathrm{Sym}(X)$ are called disjoint iff $\mathrm{supp}(f) \cap \mathrm{supp}(g) = \emptyset$. (a) Prove
that for all $x \in X$ and $f \in \mathrm{Sym}(X)$, $x \in \mathrm{supp}(f)$ implies $f(x) \in \mathrm{supp}(f)$. (b) Prove that
disjoint permutations commute, i.e., for all disjoint $f, g \in \mathrm{Sym}(X)$, $f \circ g = g \circ f$. (c) Suppose
$f \in \mathrm{Sym}(X)$ is given in cycle notation by $f = C_1 C_2 \cdots C_k$, where the $C_i$ are cycles involving
pairwise disjoint subsets of $X$. Show that the $C_i$'s commute with one another, and prove
carefully that $f = C_1 \circ C_2 \circ \cdots \circ C_k$ (cf. 9.19).


9.166. Prove 9.27. 


9.167. (a) Verify the formula $(i_1, i_2, \ldots, i_k) = (i_1, i_2) \circ (i_2, i_3) \circ (i_3, i_4) \circ \cdots \circ (i_{k-1}, i_k)$ used
in the proof of 9.33. (b) Prove that every transposition has sign $-1$ by finding an explicit
formula for $(i, j)$ as a product of an odd number of basic transpositions (which have sign
$-1$ by 9.26 with $w = \mathrm{id}$).


9.168. Given $f \in S_n$, what is the relationship between the one-line forms of $f$ and $f \circ (i, j)$?
What about $f$ and $(i, j) \circ f$?


9.169. Let $f \in S_n$ and $h = (i, i+1) \circ f$. (a) Prove an analogue of 9.26 relating $\mathrm{inv}(f)$ to
$\mathrm{inv}(h)$ and $\mathrm{sgn}(f)$ to $\mathrm{sgn}(h)$. (b) Use (a) to give another proof of the formula $\mathrm{sgn}(f \circ g) =
\mathrm{sgn}(f)\,\mathrm{sgn}(g)$ that proceeds by induction on $\mathrm{inv}(f)$.


9.170. Prove that for $n \geq 3$, every $f \in A_n$ can be written as a product of 3-cycles.


9.171. Suppose an $n \times n$ matrix $A$ is given in block form as $A = \begin{bmatrix} B & 0 \\ C & D \end{bmatrix}$, where $B$ is
$k \times k$, $C$ is $(n-k) \times k$, $D$ is $(n-k) \times (n-k)$, and $0$ denotes a $k \times (n-k)$ block of zeroes.
Prove that $\det(A) = \det(B) \det(D)$.


9.172. Algorithmic Complexity of Determinant Evaluation. Let $A \in M_n(F)$ where
$F$ is a field. (a) How many additions and multiplications in $F$ are needed to compute
$\det(A)$ directly from 9.37? (b) How many additions and multiplications in $F$ are needed to
compute $\det(A)$ recursively, using 9.48? (c) Explain how to use 9.41 and 9.47 to compute
$\det(A)$ efficiently (using about $cn^3$ field operations for some constant $c$).


9.173. Permanents. The permanent of an $n \times n$ matrix $A \in M_n(R)$ is defined as $\mathrm{per}(A) =
\sum_{w \in S_n} \prod_{i=1}^{n} A(i, w(i))$. Prove the following facts about permanents: (a) $\mathrm{per}(A^{t}) = \mathrm{per}(A)$;
(b) if $A$ is diagonal, then $\mathrm{per}(A) = \prod_{i=1}^{n} A(i, i)$; (c) $\mathrm{per}(I_n) = 1_R$; (d) $\mathrm{per}(A)$ is an
$R$-multilinear function of the rows and columns of $A$ (cf. 9.45); (e) if $B$ is obtained from $A$ by
permuting the rows in any fashion, then $\mathrm{per}(B) = \mathrm{per}(A)$.

9.174. State and prove analogues for permanents of the Laplace expansions in 9.48. 
9.175. Verify the characterization of R-linear maps stated in 9.44. 

9.176. Complete the proof of 9.50 by showing that $(\mathrm{adj}\, A)A = \det(A) I_n$.


9.177. Cramer's Rule. Let $A \in M_n(R)$ where $\det(A)$ is invertible in $R$, let $b$ be a given
$n \times 1$ vector, and let $x = [x_1 \cdots x_n]^{t}$. Show that the unique solution of the linear system
$Ax = b$ is given by $x_i = \det(A_i)/\det(A)$, where $A_i$ is the matrix obtained from $A$ by
replacing the $i$th column by $b$.
9.178. Verify the Cauchy-Binet formula for the matrices


2 103 ae ye 

A=}]1 -1 1 2 B= 
4. 0 2 2]. ee 
-2 0 -1 


9.179. Consider a function $w : \{1, 2, \ldots, k\} \to \{1, 2, \ldots, n\}$, which we regard as a word
$w = w_1 w_2 \cdots w_k$. Show that there exist basic transpositions $t_1, \ldots, t_m \in S_k$ such that
$w \circ (t_1 t_2 \cdots t_m)$ is a weakly increasing word, and the minimum possible value of $m$ is $\mathrm{inv}(w) =
\sum_{i<j} \chi(w_i > w_j)$.


9.180. Let A and B be n x n matrices. Prove that det(AB) = det(A) det(B) by imitating 
(and simplifying) the proof of the Cauchy-Binet formula 9.53. 


9.181. Verify all the assertions in 9.57. 


9.182. Let $x$ be an element of a group $G$, written multiplicatively. Use the laws of exponents
to verify that $\langle x \rangle = \{x^n : n \in \mathbb{Z}\}$ is a subgroup of $G$, as stated in 9.60.


9.183. Subgroup Generated by a Set. Let $S$ be a nonempty subset of a group $G$. Let
$\langle S \rangle$ be the set of elements of $G$ of the form $x_1 x_2 \cdots x_n$, where $n \in \mathbb{N}^{+}$ and, for $1 \leq i \leq n$,
either $x_i \in S$ or $x_i^{-1} \in S$. Prove that $\langle S \rangle \leq G$, and for all $T$ with $S \subseteq T \leq G$, $\langle S \rangle \leq T$.


9.184. Prove that every subgroup of a cyclic group is cyclic.


9.185. For subsets $S$ and $T$ of a multiplicative group $G$, define $ST = \{st : s \in S, t \in T\}$.
(a) Show that if $S \trianglelefteq G$ and $T \leq G$, then $ST = TS$ and $ST \leq G$. Give an example to show
$ST$ may not be normal in $G$. (b) Show that if $S \trianglelefteq G$ and $T \trianglelefteq G$, then $ST \trianglelefteq G$. (c) Give an
example of a group $G$ and subgroups $S$ and $T$ such that $ST$ is not a subgroup of $G$.


9.186. Let $S$ and $T$ be finite subgroups of a group $G$. Prove that $|S| \cdot |T| = |ST| \cdot |S \cap T|$.


9.187. Assume that $G$ is a group and $H \leq G$. Let $H^{-1} = \{h^{-1} : h \in H\}$. (a) Show that
$HH = H^{-1} = HH^{-1} = H$. (b) Prove that $H \trianglelefteq G$ iff $gHg^{-1} = H$ for all $g \in G$.


9.188. Show that a subgroup H of a group G is normal in G iff H is a union of conjugacy 
classes of G. 


9.189. Find all the subgroups of $S_4$. Which subgroups are normal? Confirm that Sylow's
theorem 9.139 is true for this group.


9.190. Find all normal subgroups of $S_5$, and prove that you have found them all (Lagrange's
theorem and 9.188 can be helpful here).


9.191. Suppose $H$ is a finite, nonempty subset of a group $G$ such that $xy \in H$ for all
$x, y \in H$. Prove that $H \leq G$. Give an example to show this result may not be true if $H$ is
not finite.


9.192. Given any simple graph or digraph K with vertex set X, show that Aut(K) is a 
subgroup of Sym(X). 


9.193. Determine the automorphism groups of the following graphs and digraphs: (a) the
path graph $P_n$ (see 3.124); (b) the complete graph $K_n$ (see 3.124); (c) the empty graph on
$\{1, 2, \ldots, n\}$ with no edges; (d) the directed cycle with vertex set $\{1, 2, \ldots, n\}$ and edges
$(n, 1)$ and $(i, i+1)$ for $i < n$; (e) the simple graph with vertex set $\{\pm 1, \pm 2, \ldots, \pm n\}$ and
edge set $\{\{i, -i\} : 1 \leq i \leq n\}$.


9.194. Let $K$ be the Petersen graph (defined in 3.215). (a) Given two paths $P =
(y_0, y_1, y_2, y_3)$ and $Q = (z_0, z_1, z_2, z_3)$ in $K$, prove that there exists a unique automorphism
of $K$ that maps $y_i$ to $z_i$ for $0 \leq i \leq 3$. (b) Prove that $K$ has exactly $5! = 120$ automorphisms.
(c) Is $\mathrm{Aut}(K)$ isomorphic to $S_5$?


9.195. Let $Q_k$ be the simple graph with vertex set $V = \{0, 1\}^k$ and edge set $E = \{\{v, w\} \subseteq
V : v, w \text{ differ in exactly one position}\}$. $Q_k$ is called a $k$-dimensional hypercube. (a) Compute
$|V(Q_k)|$, $|E(Q_k)|$, and $\deg(Q_k)$. (b) Show that $Q_k$ has exactly $2^{k-j}\binom{k}{j}$ induced subgraphs
isomorphic to $Q_j$. (c) Find all the automorphisms of $Q_k$. How many are there?


9.196. (a) Construct an undirected graph whose automorphism group has size three. What 
is the minimum number of vertices in such a graph? (b) For each n > 1, construct an 
undirected graph whose automorphism group is cyclic of size n. 


9.197. Let $G$ be a simple graph with connected components $C_1, \ldots, C_k$. What is the relation
between $|\mathrm{Aut}(G)|$ and $(|\mathrm{Aut}(C_i)| : 1 \leq i \leq k)$?


9.198. Let $f : G \to H$ be a group homomorphism. (a) Show that if $K \leq G$, then $f[K] =
\{f(x) : x \in K\}$ is a subgroup of $H$. If $K \trianglelefteq G$, must $f[K]$ be normal in $H$? (b) Show
that if $L \leq H$, then $f^{-1}[L] = \{x \in G : f(x) \in L\}$ is a subgroup of $G$. If $L \trianglelefteq H$, must
$f^{-1}[L]$ be normal in $G$? (c) Deduce from (a) and (b) that the kernel and image of a group
homomorphism are subgroups.


9.199. Show that the group of nonzero complex numbers under multiplication is isomorphic
to the product of the subgroups $\mathbb{R}^{+}$ and $\{z \in \mathbb{C} : |z| = 1\}$.


9.200. Give examples of four non-isomorphic groups of size 12. 


9.201. Suppose $G$ is a commutative group with subgroups $H$ and $K$, such that $G = HK$
and $H \cap K = \{e_G\}$. (a) Prove that the map $(h, k) \mapsto hk$ is a group isomorphism from $H \times K$
onto $G$. (b) Does any analogous result hold if $G$ is not commutative? What if $H$ and $K$ are
normal in $G$?


9.202. (a) Let $G$ be a group and $x \in G$. Show there exists a unique group homomorphism
$f : \mathbb{Z} \to G$ with $f(1) = x$. (b) Use (a) to determine the group $\mathrm{Aut}(\mathbb{Z})$.


9.203. (a) Suppose $G$ is a group, $x \in G$, and $x^n = e_G$ for some $n \geq 2$. Show there exists a
unique group homomorphism $f : \mathbb{Z}_n \to G$ with $f(1) = x$. (b) Use (a) to prove that $\mathrm{Aut}(\mathbb{Z}_n)$
is isomorphic to the group $\mathbb{Z}_n^{*}$ of invertible elements of $\mathbb{Z}_n$ under multiplication modulo $n$.


9.204. Properties of Order. Let $G$ be a group and $x \in G$. (a) Prove $x$ and $x^{-1}$ have the
same order. (b) Show that if $x$ has infinite order, then so does $x^i$ for all nonzero integers
$i$. (c) Suppose $x$ has finite order $n$. Show that the order of $x^k$ is $n/\gcd(k, n)$ for all $k \in \mathbb{Z}$.
(d) Show that if $f : G \to H$ is a group isomorphism, then $x$ and $f(x)$ have the same order.
What can be said if $f$ is only a group homomorphism?


9.205. Quotient Groups. (a) Suppose $H$ is a normal subgroup of $G$. Show that the set
$G/H$ of left cosets of $H$ in $G$ becomes a group of size $[G : H]$ if we define $(xH) \star (yH) =
(xy)H$ for all $x, y \in G$. (One must first show that this operation is well-defined: i.e., for
all $x_1, x_2, y_1, y_2 \in G$, $x_1 H = x_2 H$ and $y_1 H = y_2 H$ imply $x_1 y_1 H = x_2 y_2 H$. For this,
use the coset equality theorem.) (b) With the notation in (a), define $\pi : G \to G/H$ by
$\pi(x) = xH$ for $x \in G$. Show that $\pi$ is a surjective group homomorphism with kernel $H$. (c)
Let $H = \{\mathrm{id}, (1, 2)\} \leq S_3$. Find $x_1, x_2, y_1, y_2 \in S_3$ with $x_1 H = x_2 H$ and $y_1 H = y_2 H$, but
$x_1 y_1 H \neq x_2 y_2 H$. This shows that normality of $H$ is needed for the product in (a) to be well
defined.
9.206. Let $H$ be a normal subgroup of a group $G$. (a) Prove that $G/H$ is commutative if
$G$ is commutative. (b) Prove that $G/H$ is cyclic if $G$ is cyclic. (c) Does the converse of (a)
or (b) hold? Explain.


9.207. Fundamental Homomorphism Theorem for Groups. Suppose $G$ and $H$ are
groups and $f : G \to H$ is a group homomorphism. Let $K = \{x \in G : f(x) = e_H\}$ be the
kernel of $f$, and let $I = \{y \in H : \exists x \in G,\ y = f(x)\}$ be the image of $f$. Show that $K \trianglelefteq G$,
$I \leq H$, and there exists a unique group isomorphism $\bar{f} : G/K \to I$ given by $\bar{f}(xK) = f(x)$
for $x \in G$.


9.208. Universal Mapping Property for Quotient Groups. Let $G$ be a group with
normal subgroup $N$, let $\pi : G \to G/N$ be the homomorphism $\pi(x) = xN$ for $x \in G$, and
let $H$ be any group. (a) Show that if $h : G/N \to H$ is a group homomorphism, then $h \circ \pi$
is a group homomorphism from $G$ to $H$ sending each $n \in N$ to $e_H$. (b) Conversely, given
any group homomorphism $f : G \to H$ such that $f(n) = e_H$ for all $n \in N$, show that there
exists a unique group homomorphism $h : G/N \to H$ such that $f = h \circ \pi$. (c) Conclude that
the map $h \mapsto h \circ \pi$ is a bijection from the set of all group homomorphisms from $G/N$ to $H$
to the set of all group homomorphisms from $G$ to $H$ that map everything in $N$ to $e_H$.


9.209. Diamond Isomorphism Theorem for Groups. Suppose $G$ is a group, $S \trianglelefteq G$,
and $T \leq G$. Show (cf. 9.185) $TS = ST \leq G$, $S \trianglelefteq TS$, $(S \cap T) \trianglelefteq T$, and there is a well-defined
group isomorphism $f : T/(S \cap T) \to (TS)/S$ given by $f(x(S \cap T)) = xS$ for all $x \in T$. Use
this to give another solution of 9.186 in the case where $S$ is normal in $G$.


9.210. Double-Quotient Isomorphism Theorem for Groups. Assume $A \leq B \leq C$
are groups with $A$ and $B$ both normal in $C$. Show that $A \trianglelefteq B$, $B/A \trianglelefteq C/A$, and $(C/A)/(B/A)$
is isomorphic to $C/B$ via the map $(xA)B/A \mapsto xB$ for $x \in C$.


9.211. Correspondence Theorem for Quotient Groups. Let $H$ be a normal subgroup
of a group $G$. Let $X$ be the set of subgroups of $G$ containing $H$, and let $Y$ be the set of
subgroups of $G/H$. Show that the map $L \mapsto L/H = \{xH : x \in L\}$ is an inclusion-preserving
bijection of $X$ onto $Y$ with inverse $M \mapsto \{x \in G : xH \in M\}$. If $L$ maps to $M$ under this
correspondence, show that $[G : L] = [G/H : M]$, that $[L : H] = |M|$, that $L \trianglelefteq G$ iff
$M \trianglelefteq G/H$, and that $G/L$ is isomorphic to $(G/H)/M$ whenever $L \trianglelefteq G$.


9.212. Let $G$ be a non-commutative group. Show that the rule $g \cdot x = xg$ (for $g, x \in G$)
does not define a left action of $G$ on the set $G$.


9.213. Let $G$ act on itself by conjugation: $g \ast x = gxg^{-1}$ for $g, x \in G$. Verify that the axioms
for a left group action are satisfied.


9.214. Let $(X, \ast)$ be a $G$-set and $f : K \to G$ a group homomorphism. Verify the $K$-set
axioms for the action $k \bullet x = f(k) \ast x$ ($k \in K$, $x \in X$).


9.215. Suppose $\ast : G \times X \to X$ is a group action. (a) Show that $P(X)$ is a $G$-set via the
action $g \bullet S = \{g \ast s : s \in S\}$ for $g \in G$ and $S \in P(X)$. (b) For fixed $k$, show that the set of
all $k$-element subsets of $X$ is a $G$-stable subset of $P(X)$.


9.216. Verify the action axioms for the action of $S_n$ on $V$ in 9.87.


9.217. Suppose $(X, \ast)$ is a $G$-set and $W$ is a set. Show that the set of functions $F : W \to X$
is a $G$-set via the action $(g \bullet F)(w) = g \ast (F(w))$ for all $g \in G$, $F \in {}^{W}\!X$, and $w \in W$.


9.218. Let a subgroup $H$ of a group $G$ act on $G$ via $h \ast x = xh^{-1}$ for $h \in H$ and $x \in G$.
Show that the orbit $H \ast x$ is the left coset $xH$, for $x \in G$.
9.219. (a) Suppose $f : X \to Y$ is a bijection. Show that the map $T : \mathrm{Sym}(X) \to \mathrm{Sym}(Y)$
given by $T(g) = f \circ g \circ f^{-1}$ for $g \in \mathrm{Sym}(X)$ is a group isomorphism. (b) Use (a) and Cayley's
theorem to conclude that every $n$-element group is isomorphic to a subgroup of $S_n$.


9.220. Let a group $G$ act on a set $X$. Show that $\bigcap_{x \in X} \mathrm{Stab}(x)$ is a normal subgroup of $G$.
Give an example to show that a stabilizer subgroup $\mathrm{Stab}(x)$ may not be normal in $G$.


9.221. Let G act on itself by conjugation. (a) By considering the associated permuta- 
tion representation and using the fundamental homomorphism theorem 9.207, deduce that 
G/Z(G) is isomorphic to the subgroup of inner automorphisms in Aut(G). (b) Show that 
the subgroup of inner automorphisms is normal in Aut(G). 


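9.222. Let $(\mathbb{R}, +)$ act on $\mathbb{R}^2$ (viewed as column vectors) by the rule

$$\theta \ast \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (\theta, x, y \in \mathbb{R}).$$

Verify that this is an action, and describe the orbit and stabilizer of each point in $\mathbb{R}^2$.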


9.223. Let $f \in S_n$, and let $\langle f \rangle$ act on $\{1, 2, \ldots, n\}$ via $g \cdot x = g(x)$ for all $g \in \langle f \rangle$ and
$x \in \{1, 2, \ldots, n\}$. Prove that the orbits of this action are the connected components of the
digraph of $f$.


9.224. Suppose $X$ is a $G$-set and $x, y \in X$. Without appealing to equivalence relations,
give a direct proof that $Gx \cap Gy \neq \emptyset$ implies $Gx = Gy$.


9.225. Let $\ast$ be a right action of a group $G$ on a set $X$. (a) Prove that $X$ is the disjoint union
of orbits $x \ast G$. (b) Prove that $|x \ast G| = [G : \mathrm{Stab}(x)]$, where $\mathrm{Stab}(x) = \{g \in G : x \ast g = x\}$.


9.226. State and prove a version of the coset equality theorem 9.111 for right cosets. 


9.227. Let $G$ be a group with subgroup $H$. Prove that the map $T(xH) = Hx^{-1}$ for $x \in G$
is a well-defined bijection from the set of left cosets of $H$ in $G$ onto the set of right cosets
of $H$ in $G$.


9.228. Let $X$ be a $G$-set. For $x \in X$ and $g \in G$, prove that $g\,\mathrm{Stab}(x) = \{h \in G : h \ast x =
g \ast x\}$. (This shows that each left coset of the stabilizer of $x$ consists of those group elements
sending $x$ to a particular element in its orbit $Gx$. Compare to 9.120.)


9.229. Let $G$ be a group with subgroup $H$. Prove the following facts about the normalizer
of $H$ in $G$ (see 9.126). (a) $N_G(H)$ contains $H$; (b) $H \trianglelefteq N_G(H)$; (c) for any $L \leq G$ such that
$H \trianglelefteq L$, $L \leq N_G(H)$; (d) $H \trianglelefteq G$ iff $N_G(H) = G$.


9.230. Let $X$ be a $G$-set. Prove: for $g \in G$ and $x \in X$, $\mathrm{Stab}(gx) = g\,\mathrm{Stab}(x)\,g^{-1}$.


9.231. Let $H$ and $K$ be subgroups of a group $G$. Prove that the $G$-sets $G/H$ and $G/K$ are
isomorphic (as defined in 9.129) iff $H$ and $K$ are conjugate subgroups of $G$ (i.e., $K = gHg^{-1}$
for some $g \in G$).


9.232. Calculate $z_\mu$ for every $\mu \in \mathrm{Par}(6)$.


9.233. Explicitly write down all elements in the centralizer of $g = (2, 4, 7)(1, 6)(3, 8)(5) \in
S_8$. How large is this centralizer? How large is the conjugacy class of $g$?


9.234. Suppose $f = (2, 4, 7)(8, 10, 15)(1, 9)(11, 12)(17, 20)(18, 19)$ and
$g = (7, 8, 9)(1, 4, 5)(11, 20)(2, 6)(3, 18)(13, 19)$. How many $h \in S_{20}$ satisfy $h \circ f = g \circ h$?
9.235. Find all integer partitions $\mu$ of $n$ for which $z_\mu = n!$. Use your answer to calculate
$Z(S_n)$ for all $n \geq 1$.


9.236. Prove that for all $n \geq 1$, $p(n) = \frac{1}{n!} \sum_{f \in S_n} z_{\mathrm{type}(f)}$.


9.237. Conjugacy Classes of $A_n$. For $f \in A_n$, write $[f]_{A_n}$ (resp. $[f]_{S_n}$) to denote the
conjugacy class of $f$ in $A_n$ (resp. $S_n$). (a) Show that $[f]_{A_n} \subseteq [f]_{S_n}$ for all $f \in A_n$. (b) Prove:
for all $f \in A_n$, if there exists $g \in S_n \setminus A_n$ with $fg = gf$, then $[f]_{A_n} = [f]_{S_n}$; but if no such
$g$ exists, then $[f]_{S_n}$ is the disjoint union of $[f]_{A_n}$ and $[(1, 2) \circ f \circ (1, 2)]_{A_n}$, and the latter two
conjugacy classes are equal in size. (c) What are the conjugacy classes of $A_5$? How large are
they? Use this to prove that $A_5$ is simple, i.e., the only normal subgroups of $A_5$ are $\{\mathrm{id}\}$
and $A_5$.


9.238. Suppose G is a finite group and p is a prime divisor of |G|. Show that the number 
of elements in G of order p is congruent to —1 (mod p). 


9.239. (a) Compute Ce mod 7. (b) Compute Ca) mod 10. 


9.240. Prove 9.138 without using Lucas’ congruence, by counting powers of p in the nu- 
merator and denominator of (a) = (p%b) | pa /(p%)!. 


9.241. Class Equation. Let $G$ be a finite group with center $Z(G)$ (see 9.125), and let
$x_1, \ldots, x_k \in G$ be such that each conjugacy class of $G$ of size greater than 1 contains exactly
one $x_i$. Prove that $|G| = |Z(G)| + \sum_{i=1}^{k} [G : C_G(x_i)]$, where each term in the sum is a divisor
of $|G|$ greater than 1.


9.242. A $p$-group is a finite group of size $p^e$ for some $e \geq 1$. Prove that every $p$-group $G$
has $|Z(G)| > 1$.


9.243. Wilson's Theorem. Use group actions to prove that if an integer $p > 1$ is prime,
then $(p - 1)! \equiv -1 \pmod{p}$. Is the converse true?


9.244. How many ways are there to color an $n \times n$ chessboard with $q$ possible colors if:
(a) no symmetries are allowed; (b) rotations of a given board are considered equivalent; (c) 
rotations and reflections of a given board are considered equivalent? 


9.245. Consider an $m \times n$ chessboard where $m \neq n$. (a) Describe all symmetries of this
board. (b) How many ways can we color such a board with q possible colors? 


9.246. How many n-letter words can be made using a k-letter alphabet if we identify each 
word with its reversal? 


9.247. Consider necklaces that can use q kinds of gemstones, where rotations and reflections 
of a given necklace are considered equivalent. How many such necklaces are there with: (a) 
eight stones; (b) nine stones; (c) n stones? 


9.248. Taking rotational symmetries into account, how many ways can we color the vertices 
of a regular tetrahedron with 7 available colors? 


9.249. Taking rotational symmetries into account, how many ways can we color the vertices 
of a cube with 8 available colors? 


9.250. Taking rotational symmetries into account, how many ways can we color the faces 
of a cube with q available colors? 


9.251. Taking rotational symmetries into account, how many ways can we color the edges 
of a cube with q available colors? 


9.252. Taking all symmetries into account, how many ways are there to color the vertices 
of the cycle $C_3$ with three distinct colors chosen from a set of five colors? 


9.253. Taking all symmetries into account, how many ways are there to color the vertices 
of the cycle $C_6$ so that three vertices are blue, two are red, and one is yellow? 


9.254. Taking rotational symmetries into account, how many ways are there to color the 
vertices of a regular tetrahedron so that: (a) two are blue and two are red; (b) one is red, 
one is blue, one is green, and one is yellow? 


9.255. Taking rotational symmetries into account, how many ways are there to color the 
vertices of a cube so that four are blue, two are red, and two are green? 


9.256. Taking rotational symmetries into account, how many ways are there to color the 
faces of a cube so that: (a) three are red, two are blue, and one is green; (b) two are red, 
two are blue, one is green, and one is yellow? 


9.257. Taking rotational symmetries into account, how many ways are there to color the 
edges of a cube so that four are red, four are blue, and four are yellow? 


9.258. How many ways can we color a $4 \times 4$ chessboard with five colors (identifying rotations 
of a given board) if each color must be used at least once? 


9.259. How many ways can we build an eight-stone necklace using five kinds of gems 
(identifying rotations and reflections of a given necklace) if each type of gem must be used 
at least once? 


Notes 


For a more detailed development of group theory, we recommend the excellent book by 
Rotman [119]. More information on groups, rings, and fields may be found in textbooks on 
abstract algebra such as [29, 70, 71]. Many facts about matrices and determinants, including 
the Cauchy-Binet formula, appear in the matrix theory text by Lancaster [82]. The proof 
of Cauchy’s theorem given in 9.136 is due to McKay [91]. The proof of Lucas’ congruence 
in 9.137 is due to Sagan [120]. The proof of Sylow’s theorem given in 9.139 is usually 
attributed to Wielandt [137], although Miller [92] gave a proof in a similar spirit over 40 
years earlier. Proofs of Fermat’s little theorem and Wilson’s theorem using group actions 
were given by Peterson [103]. 


Tableaux and Symmetric Polynomials 


In this chapter, we study combinatorial objects called tableaux. Informally, a tableau is a 
filling of the cells in the diagram of an integer partition with labels that may be subject 
to certain ordering conditions. We use tableaux to give a combinatorial definition of Schur 
polynomials, which are examples of symmetric polynomials. The theory of symmetric poly- 
nomials nicely demonstrates the interplay between combinatorics and algebra. We give a 
brief introduction to this vast subject in this chapter, stressing bijective proofs throughout. 




10.1 Partition Diagrams and Skew Shapes 


The reader may find it helpful at this point to review the basic definitions concerning 
integer partitions (see §2.8). Table 10.1 summarizes the notation used in this chapter to 
discuss integer partitions. In combinatorial arguments, we usually visualize the diagram 
$\mathrm{dg}(\mu)$ as a collection of unit boxes, where $(i,j) \in \mathrm{dg}(\mu)$ corresponds to the box in row $i$ 
and column $j$. The conjugate partition $\mu'$ is the partition whose diagram is obtained from 
$\mathrm{dg}(\mu)$ by interchanging the roles of rows and columns. 

Before defining tableaux, we need the notion of a skew shape. 


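10.1. Definition: Skew Shapes. Let $\mu$ and $\nu$ be two integer partitions such that $\mathrm{dg}(\nu) \subseteq 
\mathrm{dg}(\mu)$, or equivalently, $\nu_i \le \mu_i$ for all $i \ge 1$. In this situation, we define the skew shape 


\[ \mu/\nu = \mathrm{dg}(\mu) \setminus \mathrm{dg}(\nu) = \{(i,j) : 1 \le i \le \ell(\mu),\ \nu_i < j \le \mu_i\}. \]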


We can visualize $\mu/\nu$ as the collection of unit squares obtained by starting with the 
diagram of $\mu$ and erasing the squares in the diagram of $\nu$. If $\nu = 0 = (0,0,\ldots)$ is the zero 
partition, then $\mu/0 = \mathrm{dg}(\mu)$. A skew shape of the form $\mu/0$ is sometimes called a straight 
shape. 


TABLE 10.1 
Notation related to integer partitions. 

Notation                                         Definition 
$\mathrm{Par}(k)$                                set of integer partitions of $k$ 
$p(k)$                                           number of integer partitions of $k$ 
$\mathrm{Par}_N(k)$                              set of integer partitions of $k$ with at most $N$ parts 
$\mu \vdash k$                                   $\mu$ is an integer partition of $k$ 
$\mu_i$                                          the $i$th largest part of the partition $\mu$ 
$\ell(\mu)$                                      the number of nonzero parts of the partition $\mu$ 
$\mathrm{dg}(\mu)$                               the diagram of $\mu$, i.e., $\{(i,j) \in \mathbb{N} \times \mathbb{N} : 1 \le i \le \ell(\mu),\ 1 \le j \le \mu_i\}$ 
$\mu'$                                           conjugate partition to $\mu$ 
$(1^{a_1} 2^{a_2} \cdots k^{a_k} \cdots)$        the partition with $a_k$ parts equal to $k$ for $k \ge 1$ 


10.2. Example. If $\mu = (7,7,3,3,2,1)$ and $\nu = (5,2,2,2,1)$, then 

    $\mu/\nu$ = [diagram not reproduced] 

Similarly, 

    (6,5,4,3,2)/(3,3,3) = [diagram not reproduced] 

Skew shapes need not be connected; for instance, 

    (5,2,2,1)/(3,2) = [diagram not reproduced] 

The skew shape $\mu/\nu$ does not always determine $\mu$ and $\nu$ uniquely; for example, 

    (5,2,2,1)/(3,2) = (5,3,2,1)/(3,3). 
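
Definition 10.1 translates directly into a few lines of code. The following Python sketch is not part of the original text; the function names `skew_cells` and `render` are illustrative choices. It lists the cells of $\mu/\nu$ using the condition $\nu_i < j \le \mu_i$ and draws the shape, with `#` marking a cell and `.` marking a square that was erased.

```python
def skew_cells(mu, nu=()):
    """Return the set of cells (i, j) of the skew shape mu/nu (1-indexed)."""
    mu = [part for part in mu if part > 0]
    nu = [part for part in nu if part > 0]
    if len(nu) > len(mu):
        raise ValueError("dg(nu) is not contained in dg(mu)")
    nu = nu + [0] * (len(mu) - len(nu))          # pad nu with zero parts
    if any(n > m for m, n in zip(mu, nu)):
        raise ValueError("dg(nu) is not contained in dg(mu)")
    return {(i, j)
            for i, (m, n) in enumerate(zip(mu, nu), start=1)
            for j in range(n + 1, m + 1)}        # nu_i < j <= mu_i


def render(cells):
    """Draw a set of cells; '#' is a cell, '.' is an erased square."""
    if not cells:
        return ""
    n_rows = max(i for i, _ in cells)
    n_cols = max(j for _, j in cells)
    lines = []
    for i in range(1, n_rows + 1):
        row = "".join("#" if (i, j) in cells else "."
                      for j in range(1, n_cols + 1))
        lines.append(row.rstrip("."))            # drop trailing blanks
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(skew_cells((7, 7, 3, 3, 2, 1), (5, 2, 2, 2, 1))))
```

Because a skew shape is stored here as a plain set of cells, comparing two sets is enough to confirm that different pairs of partitions can give the same skew shape, as in the last display of Example 10.2.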
Some special skew shapes will arise frequently in the sequel. 


10.3. Definition: Hooks and Strips. A hook is a skew shape of the form $(a, 1^{n-a})/0$ 
for some $a \le n$. A horizontal strip is a skew shape that contains at most one cell in each 
column. A vertical strip is a skew shape that contains at most one cell in each row. 


10.4. Example. The following picture displays the four hooks of size 4. 

    [diagrams of the hooks (4), (3,1), (2,1,1), and (1,1,1,1) not reproduced] 

The following skew shapes are horizontal strips of size 4. 

    [diagrams not reproduced] 

The following skew shapes are vertical strips of size 4. 

    (4,3,3,1)/(3,2,2) = [diagram not reproduced]      (7,5,4,2)/(6,4,3,1) = [diagram not reproduced] 


10.2 Tableaux 


Now we are ready to define tableaux. 


10.5. Definition: Tableaux. Let $\mu/\nu$ be a skew shape, and let X be a set. A tableau of 
shape $\mu/\nu$ with values in the alphabet X is a function $T : \mu/\nu \to X$. 


Informally, we obtain a tableau from the skew shape $\mu/\nu$ by filling each box $c \in \mu/\nu$ 
with a letter $T(c) \in X$. We often take $\nu = 0$, in which case T is called a tableau of shape $\mu$. 
Note that the plural form of “tableau” is “tableaux”; both words are pronounced tab-loh. 


10.6. Example. The following picture displays a tableau T of shape (5,5,2) with values 
in $\mathbb{N}$: 

    4 3 3 7 2 
    1 1 3 9 1 
    5 6 

Formally, T is the function with domain dg((5,5,2)) such that 

    T((1,1)) = 4, T((1,2)) = T((1,3)) = 3, ..., T((3,1)) = 5, T((3,2)) = 6. 

As another example, here is a tableau of shape (2,2,2,2) with values in {a, b, c, d}: 

    [tableau not reproduced] 

Here is a tableau of shape (3,3,3)/(2,1) with values in $\mathbb{Z}$: 

    [tableau not reproduced] 

In most discussions of tableaux, we take the alphabet to be either {1,2,...,N} for some 
fixed N, or $\mathbb{N}^+ = \{1,2,3,\ldots\}$, or $\mathbb{Z}$. 


10.7. Definition: Semistandard Tableaux and Standard Tableaux. Let T be a 
tableau of shape $\mu/\nu$ taking values in an ordered set $(X, \le)$. T is semistandard iff 
$T((i,j)) \le T((i,j+1))$ for all $i, j$ such that $(i,j)$ and $(i,j+1)$ both belong to $\mu/\nu$; and 
$T((i,j)) < T((i+1,j))$ for all $i, j$ such that $(i,j)$ and $(i+1,j)$ both belong to $\mu/\nu$. A 
standard tableau is a bijection $T : \mu/\nu \to \{1,2,\ldots,n\}$ that is also a semistandard tableau, 
where $n = |\mu/\nu|$. 


Less formally, a tableau T is semistandard iff the entries in each row of T weakly increase 
from left to right, and the entries in each column of T strictly increase from top to bottom. 
A semistandard tableau is standard iff it contains each number from 1 to n exactly once. 
The alphabet X is usually a subset of $\mathbb{Z}$ with the usual ordering. Semistandard tableaux 
are sometimes called Young tableaux (in honor of Alfred Young, one of the pioneers in the 
subject) or column-strict tableaux. 


10.8. Example. Consider the following three tableaux of shape (3,2,2): 

    T_1 = 1 1 3        T_2 = 1 2 6        T_3 = 1 2 5 
          3 4                3 5                3 2 
          5 5                4 7                4 5 

$T_1$ is semistandard but not standard; $T_2$ is both standard and semistandard; $T_3$ is neither 
standard nor semistandard. $T_3$ is not semistandard because of the strict decrease 3 > 2 in 
row 2, and also because of the weak increase $2 \le 2$ in column 2. 
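
The two conditions in Definition 10.7 can be checked mechanically. The sketch below is an illustration, not code from the text: it stores a tableau as a Python dictionary mapping cells $(i,j)$ to entries, and the helper names `tableau_from_rows`, `is_semistandard`, and `is_standard` are hypothetical. It assumes the keys of the dictionary really form a skew shape and does not verify that.

```python
def tableau_from_rows(rows, nu=()):
    """Build a tableau {(i, j): entry} from its rows, read left to right.

    Row i starts in column nu_i + 1; nu = () gives a straight shape.
    """
    nu = list(nu) + [0] * (len(rows) - len(nu))
    return {(i, nu[i - 1] + k): entry
            for i, row in enumerate(rows, start=1)
            for k, entry in enumerate(row, start=1)}


def is_semistandard(T):
    """Rows weakly increase left to right; columns strictly increase downward."""
    for (i, j), entry in T.items():
        if (i, j + 1) in T and not entry <= T[(i, j + 1)]:
            return False
        if (i + 1, j) in T and not entry < T[(i + 1, j)]:
            return False
    return True


def is_standard(T):
    """Semistandard, and each of 1, ..., n appears exactly once (n = |T|)."""
    return is_semistandard(T) and sorted(T.values()) == list(range(1, len(T) + 1))


T1 = tableau_from_rows([[1, 1, 3], [3, 4], [5, 5]])
T2 = tableau_from_rows([[1, 2, 6], [3, 5], [4, 7]])
T3 = tableau_from_rows([[1, 2, 5], [3, 2], [4, 5]])
```

Applied to the three tableaux of Example 10.8, these checks agree with the classification stated there.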


10.9. Example. There are five standard tableaux of shape (3,2), namely: 

    1 2 3      1 2 4      1 2 5      1 3 4      1 3 5 
    4 5        3 5        3 4        2 5        2 4 

As mentioned in the introduction, there is an amazing formula for counting the number of 
standard tableaux of a given shape $\mu$. This formula is proved in §12.10. 


10.10. Example. Here is a semistandard tableau of shape (6,5,5,3)/(3,2,2): 

    [tableau not reproduced] 


It will be convenient to have notation for certain sets of tableaux. 


10.11. Definition: $\mathrm{SSYT}_X(\mu/\nu)$ and $\mathrm{SYT}(\mu/\nu)$. For every skew shape $\mu/\nu$ and every 
ordered alphabet X, let $\mathrm{SSYT}_X(\mu/\nu)$ be the set of all semistandard tableaux with shape $\mu/\nu$ 
taking values in X. When X = {1,2,...,N}, we abbreviate the notation to $\mathrm{SSYT}_N(\mu/\nu)$. 
Let $\mathrm{SYT}(\mu/\nu)$ be the set of all standard tableaux of shape $\mu/\nu$. 


If $\nu = 0$, then we omit it from the notation; for instance, 10.9 displays the five elements 
in the set SYT((3,2)). Observe that $\mathrm{SYT}(\mu/\nu)$ is a finite set of tableaux. On the other 
hand, $\mathrm{SSYT}_X(\mu/\nu)$ is finite iff X is finite. 
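
Since $\mathrm{SSYT}_N(\mu/\nu)$ is finite, its elements can be listed by a simple backtracking search. The following Python sketch is an illustration added here, not an algorithm from the text; the names `ssyt` and `syt` are hypothetical. It fills the cells in row-major order, so the left and upper neighbors of a cell are always filled before the cell itself.

```python
def ssyt(mu, nu=(), N=None):
    """Yield every semistandard tableau of shape mu/nu with entries in 1..N.

    Each tableau is a dict {(i, j): entry}. If N is omitted, N = |mu/nu|.
    """
    nu = list(nu) + [0] * (len(mu) - len(nu))
    cells = [(i, j) for i in range(1, len(mu) + 1)
             for j in range(nu[i - 1] + 1, mu[i - 1] + 1)]   # row-major order
    if N is None:
        N = len(cells)

    def fill(k, T):
        if k == len(cells):
            yield dict(T)
            return
        i, j = cells[k]
        # Weak increase along the row, strict increase down the column.
        lowest = max(T.get((i, j - 1), 1), T.get((i - 1, j), 0) + 1)
        for entry in range(lowest, N + 1):
            T[(i, j)] = entry
            yield from fill(k + 1, T)
            del T[(i, j)]

    yield from fill(0, {})


def syt(mu, nu=()):
    """Yield every standard tableau of shape mu/nu."""
    for T in ssyt(mu, nu):
        if len(set(T.values())) == len(T):      # n distinct entries from 1..n
            yield T
```

For example, `len(list(syt((3, 2))))` counts the standard tableaux of shape (3,2) discussed in 10.9. The search is exponential in general, so it is only meant for small shapes.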




10.3 Schur Polynomials 


We now introduce a weight function on tableaux that keeps track of the number of times 
each label is used. 


10.12. Definition: Content of a Tableau. Let T be a tableau of shape $\mu/\nu$ with values 
in $\mathbb{N}^+$. The content of T is the infinite sequence $c(T) = (c_1, c_2, \ldots)$, where $c_k$ is the number 
of times the label k appears in T. Formally, $c_k = |\{(i,j) \in \mu/\nu : T((i,j)) = k\}|$. Every $c_k$ 
is a nonnegative integer, and the sum of all $c_k$'s is $|\mu/\nu|$. Given variables (indeterminates) 
$x_1, x_2, \ldots$, the content monomial of T is 


\[ x^{c(T)} = x_1^{c_1} x_2^{c_2} \cdots x_k^{c_k} \cdots = \prod_{u \in \mu/\nu} x_{T(u)}. \]


10.13. Example. Consider the tableaux from 10.8. The content of $T_1$ is $c(T_1) = 
(2,0,2,1,2,0,0,\ldots)$, and the content monomial of $T_1$ is $x^{c(T_1)} = x_1^2 x_3^2 x_4 x_5^2$. Similarly, 


\[ x^{c(T_2)} = x_1 x_2 x_3 x_4 x_5 x_6 x_7, \qquad x^{c(T_3)} = x_1 x_2^2 x_3 x_4 x_5^2. \]

All five standard tableaux in 10.9 have content monomial $x_1 x_2 x_3 x_4 x_5$. More generally, the 
content monomial of any $S \in \mathrm{SYT}(\mu/\nu)$ will be $\prod_{i=1}^{n} x_i$, where $n = |\mu/\nu|$. The tableau S 
shown in 10.10 has content $c(S) = (2,1,2,4,1,0,2,0,0,\ldots)$. 




We can now define the Schur polynomials, which are essentially generating functions for 
semistandard tableaux weighted by content. Recall (§7.16) that $\mathbb{Q}[x_1,\ldots,x_N]$ is the ring of 
all formal polynomials in the variables $x_1,\ldots,x_N$ with rational coefficients. 


10.14. Definition: Schur Polynomials and Skew Schur Polynomials. For each 
integer $N \ge 1$ and every integer partition $\mu$, define the Schur polynomial for $\mu$ in N 
variables by setting 

\[ s_\mu(x_1,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu)} x^{c(T)}. \]

More generally, for any skew shape $\mu/\nu$, define the skew Schur polynomial for $\mu/\nu$ by 
setting 

\[ s_{\mu/\nu}(x_1,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu/\nu)} x^{c(T)}. \]


10.15. Example. Let us compute the Schur polynomials $s_\mu(x_1, x_2, x_3)$ for all partitions $\mu$ of 
3. First, when $\mu = (3)$, we have the following semistandard tableaux of shape (3) using the 
alphabet {1, 2, 3}: 

    1 1 1    1 1 2    1 1 3    1 2 2    1 2 3    1 3 3    2 2 2    2 2 3    2 3 3    3 3 3 

It follows that 

\[ s_{(3)}(x_1,x_2,x_3) = x_1^3 + x_1^2 x_2 + x_1^2 x_3 + x_1 x_2^2 + x_1 x_2 x_3 + x_1 x_3^2 + x_2^3 + x_2^2 x_3 + x_2 x_3^2 + x_3^3. \]

Second, when $\mu = (2,1)$, we obtain the following semistandard tableaux: 

    1 1    1 1    1 2    1 2    1 3    1 3    2 2    2 3 
    2      3      2      3      2      3      3      3 

So $s_{(2,1)}(x_1,x_2,x_3) = x_1^2 x_2 + x_1^2 x_3 + x_1 x_2^2 + 2 x_1 x_2 x_3 + x_1 x_3^2 + x_2^2 x_3 + x_2 x_3^2$. Third, when 
$\mu = (1,1,1)$, we see that $s_{(1,1,1)}(x_1,x_2,x_3) = x_1 x_2 x_3$, since there is only one semistandard 
tableau in this case. 

Now consider what happens when we change the number of variables. Suppose first that 
we use N = 2 instead of N = 3. This means that the allowed alphabet for the tableaux has 
changed to {1,2}. Consulting the tableaux just computed, but disregarding those that use 
the letter 3, we conclude that 

\[ s_{(3)}(x_1,x_2) = x_1^3 + x_1^2 x_2 + x_1 x_2^2 + x_2^3; \qquad s_{(2,1)}(x_1,x_2) = x_1^2 x_2 + x_1 x_2^2; \qquad s_{(1,1,1)}(x_1,x_2) = 0. \]

In these examples, note that we can obtain the polynomial $s_\mu(x_1,x_2)$ from $s_\mu(x_1,x_2,x_3)$ 
by setting $x_3 = 0$. More generally, we claim that for any $\mu$ and any N > M, we can obtain 
$s_\mu(x_1,\ldots,x_M)$ from $s_\mu(x_1,\ldots,x_N)$ by setting the last N − M variables equal to zero. To 
verify this, consider the defining formula 

\[ s_\mu(x_1,x_2,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu)} x^{c(T)}. \]

Upon setting $x_{M+1} = \cdots = x_N = 0$ in this formula, the terms coming from tableaux T that 
use letters larger than M will become zero. We are left with the sum over $T \in \mathrm{SSYT}_M(\mu)$, 
which is precisely $s_\mu(x_1,x_2,\ldots,x_M)$. 
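
Definition 10.14 can be turned into a small computation. The sketch below is an added illustration with hypothetical names (`ssyt`, `schur`, `set_last_variables_to_zero`); it represents a polynomial as a `Counter` that maps each exponent vector $(c_1,\ldots,c_N)$ to its coefficient, and it builds $s_{\mu/\nu}$ by brute-force enumeration of tableaux, which is only practical for small shapes.

```python
from collections import Counter


def ssyt(mu, nu, N):
    """Yield all semistandard tableaux of shape mu/nu with entries in 1..N."""
    nu = list(nu) + [0] * (len(mu) - len(nu))
    cells = [(i, j) for i in range(1, len(mu) + 1)
             for j in range(nu[i - 1] + 1, mu[i - 1] + 1)]

    def fill(k, T):
        if k == len(cells):
            yield dict(T)
            return
        i, j = cells[k]
        lowest = max(T.get((i, j - 1), 1), T.get((i - 1, j), 0) + 1)
        for entry in range(lowest, N + 1):
            T[(i, j)] = entry
            yield from fill(k + 1, T)
            del T[(i, j)]

    yield from fill(0, {})


def schur(mu, N, nu=()):
    """Return s_{mu/nu}(x_1..x_N) as a Counter {exponent vector: coefficient}."""
    poly = Counter()
    for T in ssyt(mu, nu, N):
        exps = [0] * N
        for entry in T.values():
            exps[entry - 1] += 1            # content of T
        poly[tuple(exps)] += 1              # add the content monomial
    return poly


def set_last_variables_to_zero(poly, M):
    """Set x_{M+1} = ... = x_N = 0 and keep only the first M exponents."""
    return Counter({alpha[:M]: c for alpha, c in poly.items()
                    if not any(alpha[M:])})
```

With this representation, the claim about setting the last N − M variables to zero becomes a statement that two dictionaries agree, for instance `set_last_variables_to_zero(schur((2, 1), 3), 2) == schur((2, 1), 2)`.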

Suppose instead that we increase the number of variables from N = 3 to N = 5. Here 
we must draw new tableaux to find the new Schur polynomial. For instance, the tableaux 
for $\mu = (1,1,1)$ are: 

    1    1    1    1    1    1    2    2    2    3 
    2    2    2    3    3    4    3    3    4    4 
    3    4    5    4    5    5    4    5    5    5 

Accordingly, 

\[ s_{(1,1,1)}(x_1,x_2,x_3,x_4,x_5) = x_1 x_2 x_3 + x_1 x_2 x_4 + x_1 x_2 x_5 + \cdots + x_3 x_4 x_5. \]


10.16. Example. A semistandard tableau of shape $(1^k)$ using the alphabet X = 
{1,2,...,N} is essentially a strictly increasing sequence of k elements of X, which can 
be identified with a k-element subset of X. Combining this remark with the definition of 
Schur polynomials, we conclude that 

\[ s_{(1^k)}(x_1,\ldots,x_N) = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}. \tag{10.1} \]

Similarly, a semistandard tableau of shape (k) is a weakly increasing sequence of k elements 
of X, which can be identified with a k-element multiset using letters in X. So 

\[ s_{(k)}(x_1,\ldots,x_N) = \sum_{1 \le i_1 \le i_2 \le \cdots \le i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}. \tag{10.2} \]


10.17. Example. Given any integer $N \ge 4$, what is the coefficient of $x_1^2 x_2^2 x_3 x_4$ in the 
skew Schur polynomial $s_{(4,3)/(1)}(x_1,\ldots,x_N)$? The answer is the number of semistandard 
tableaux of shape (4,3)/(1) using exactly two 1’s, two 2’s, one 3, and one 4. Equivalently, we 
seek the semistandard tableaux with content (2,2,1,1). The required tableaux are shown 
here: 

      1 1 2        1 1 3        1 1 4        1 2 2        1 2 3        1 2 4 
    2 3 4        2 2 4        2 2 3        1 3 4        1 2 4        1 2 3 

So the desired coefficient is 6. Next, what is the coefficient of $x_1 x_2 x_3^2 x_4^2$? Now we must find 
the tableaux of content (1,1,2,2), which are the following: 

      1 2 3        1 2 4        1 3 3        1 3 4        2 3 3        2 3 4 
    3 4 4        3 3 4        2 4 4        2 3 4        1 4 4        1 3 4 

Again there are six tableaux, so the coefficient of $x_1 x_2 x_3^2 x_4^2$ is 6. Finally, what is the 
coefficient of $x_1^2 x_2 x_3 x_4^2$? Drawing the tableaux of content (2,1,1,2) produces the following: 

      1 1 2        1 1 3        1 1 4        1 2 4        1 3 4        1 2 3 
    3 4 4        2 4 4        2 3 4        1 3 4        1 2 4        1 4 4 

The coefficient is 6 again! One may check that for any rearrangement of the vector 
(2,1,1,2,0,0,...), the number of semistandard tableaux of shape (4,3)/(1) having this 
content is always 6. This is not a coincidence; it is a consequence of the fact that Schur 
polynomials are symmetric, which we will prove shortly (§10.6). 
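
The counts in Example 10.17 can be double-checked by brute force. The following sketch is an added illustration (the names `ssyt` and `count_with_content` are hypothetical): it enumerates all semistandard tableaux of a skew shape with entries in {1,...,N} and counts those whose content equals a given vector of length N.

```python
from itertools import permutations


def ssyt(mu, nu, N):
    """Yield all semistandard tableaux of shape mu/nu with entries in 1..N."""
    nu = list(nu) + [0] * (len(mu) - len(nu))
    cells = [(i, j) for i in range(1, len(mu) + 1)
             for j in range(nu[i - 1] + 1, mu[i - 1] + 1)]

    def fill(k, T):
        if k == len(cells):
            yield dict(T)
            return
        i, j = cells[k]
        lowest = max(T.get((i, j - 1), 1), T.get((i - 1, j), 0) + 1)
        for entry in range(lowest, N + 1):
            T[(i, j)] = entry
            yield from fill(k + 1, T)
            del T[(i, j)]

    yield from fill(0, {})


def count_with_content(mu, nu, content):
    """Count semistandard tableaux of shape mu/nu whose content is `content`."""
    N = len(content)
    target = list(content)
    count = 0
    for T in ssyt(mu, nu, N):
        c = [0] * N
        for entry in T.values():
            c[entry - 1] += 1
        if c == target:
            count += 1
    return count


# All distinct rearrangements of (2, 2, 1, 1), each mapped to its tableau count.
counts = {alpha: count_with_content((4, 3), (1,), alpha)
          for alpha in set(permutations((2, 2, 1, 1)))}
```

The dictionary `counts` then holds one entry for each of the six distinct rearrangements of (2,2,1,1) in four variables.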


10.18. Remark. We have presented a combinatorial definition of Schur polynomials using 
semistandard tableaux. One can also define Schur polynomials algebraically as a quotient 
of two determinants; see 11.45. Alternatively, one can define Schur polynomials using de- 
terminants involving the elementary or homogeneous symmetric polynomials to be defined 
below; see 11.60 and 11.61. Many properties of Schur polynomials can be established either 
combinatorially or algebraically. In this text, we prefer to give the combinatorial proofs. 


10.4 Symmetric Polynomials 


The examples of Schur polynomials computed in the last section were all symmetric; in 
other words, permuting the subscripts of the x-variables in any fashion did not change the 
answer. This section begins our examination of the general theory of symmetric polynomials. 
Throughout the discussion, we assume that K is a field containing $\mathbb{Q}$ (for instance, $\mathbb{Q}$ or 
$\mathbb{R}$ or $\mathbb{C}$), and N is a fixed positive integer. We will be working in the polynomial ring 
$R = K[x_1,\ldots,x_N]$ consisting of all polynomials in N variables with coefficients in K (see 
§7.16). 

One property of polynomial rings such as R is that we can substitute arbitrary ring 
elements for each of the variables $x_i$. A formal statement of this “universal mapping property” 
of R was given in 7.102. Suppose that $\sigma \in S_N$ is a given permutation of the subscripts of 
the x-variables. According to 7.102, there is a unique ring homomorphism $E : R \to R$ such 
that $E(c) = c$ for all $c \in K$ and $E(x_i) = x_{\sigma(i)}$ for all i. For any polynomial $f \in R$, we often 
denote $E(f)$ by $f(x_{\sigma(1)},\ldots,x_{\sigma(N)})$. Note, in particular, that $f(x_1,\ldots,x_N) = f$. Informally, 
we compute $E(f)$ by starting with a symbolic expression for f as a sum of products of $x_i$’s, 
and then replacing each symbol $x_i$ by $x_{\sigma(i)}$. With this notation in hand, we can now give 
the formal definition of symmetric polynomials. 


10.19. Definition: Symmetric Polynomials. A polynomial $f \in K[x_1,\ldots,x_N]$ is 
symmetric iff 

\[ f(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(N)}) = f(x_1, \ldots, x_N) \quad \text{for all } \sigma \in S_N. \]

In other words, any permutation of the variables $x_i$ leaves f unchanged. Since any 
permutation can be achieved by a finite sequence of basic transpositions (see 9.29), f is 
symmetric iff for every $i < N$, interchanging $x_i$ and $x_{i+1}$ in f leaves f unchanged. 

We now introduce special names for some commonly used symmetric polynomials. 


10.20. Definition: Power Sums. For every $k \ge 1$, the polynomial 

\[ p_k(x_1,x_2,\ldots,x_N) = x_1^k + x_2^k + \cdots + x_N^k \]

is evidently symmetric. This polynomial is called the kth power-sum in N variables. 

For example, $p_3(x_1,x_2,x_3,x_4,x_5) = x_1^3 + x_2^3 + x_3^3 + x_4^3 + x_5^3$. 

10.21. Definition: Elementary Symmetric Polynomials. For fixed k with $1 \le k \le N$, 
define the polynomial 

\[ e_k(x_1,x_2,\ldots,x_N) = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}. \]

The polynomial $e_k$ is called an elementary symmetric polynomial in N variables. One may 
check that $e_k$ is indeed symmetric. We also set $e_0(x_1,\ldots,x_N) = 1$ and $e_k(x_1,\ldots,x_N) = 0$ 
for all k > N. 


For example, $e_2(x_1,x_2,x_3,x_4) = x_1 x_2 + x_1 x_3 + x_1 x_4 + x_2 x_3 + x_2 x_4 + x_3 x_4$. By 
formula (10.1), we see that $e_k(x_1,\ldots,x_N) = s_{(1^k)}(x_1,\ldots,x_N)$, so that elementary symmetric 
polynomials are special cases of Schur polynomials. 


10.22. Definition: Complete Symmetric Polynomials. For fixed $k \ge 1$, define the 
polynomial 

\[ h_k(x_1,x_2,\ldots,x_N) = \sum_{1 \le i_1 \le i_2 \le \cdots \le i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}. \]

One may verify that $h_k$ really is symmetric. We also set $h_0(x_1,\ldots,x_N) = 1$. The polynomials 
$h_k$ are called complete homogeneous symmetric polynomials in N variables. 


We call $h_k$ “complete” because it is the sum of all monomials of degree k in the given 
variables. For example, $h_2(x_1,x_2,x_3) = x_1^2 + x_2^2 + x_3^2 + x_1 x_2 + x_1 x_3 + x_2 x_3$. By formula (10.2), 
we see that $h_k(x_1,\ldots,x_N) = s_{(k)}(x_1,\ldots,x_N)$, so that complete symmetric polynomials are 
also special cases of Schur polynomials. 
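
The three families $p_k$, $e_k$, and $h_k$ are easy to evaluate numerically, because the index sets in their definitions are exactly what Python's `itertools.combinations` (strictly increasing indices) and `itertools.combinations_with_replacement` (weakly increasing indices) produce. The sketch below is an added illustration; the function names are hypothetical and the inputs are ordinary numbers substituted for $x_1,\ldots,x_N$.

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def power_sum(k, xs):
    """p_k evaluated at the numbers xs = (x_1, ..., x_N)."""
    return sum(x ** k for x in xs)


def elementary(k, xs):
    """e_k at xs: sum over k-element subsets of positions (0 when k > N)."""
    return sum(prod(choice) for choice in combinations(xs, k))


def complete(k, xs):
    """h_k at xs: sum over k-element multisets of positions."""
    return sum(prod(choice) for choice in combinations_with_replacement(xs, k))
```

The conventions $e_0 = h_0 = 1$ and $e_k = 0$ for k > N come out automatically, since there is exactly one empty selection and no k-element subset of fewer than k positions.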
The polynomial $q(x_1,\ldots,x_N) = \sum_{i \ne j} x_i^3 x_j^2$ (summed over $1 \le i, j \le N$) is readily seen to 
be symmetric. This example can be generalized as follows. 


10.23. Definition: Monomial Symmetric Polynomials. Let $\mu$ be an integer partition 
with at most N nonzero parts. Write $\mu = (\mu_1 \ge \mu_2 \ge \cdots \ge \mu_N) \in \mathbb{N}^N$ by adding zero 
parts if necessary. For any $\alpha \in \mathbb{N}^N$, write $x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_N^{\alpha_N}$. Let $\mathrm{sort}(\alpha) \in \mathbb{N}^N$ be the 
unique partition obtained by sorting the entries of $\alpha$ into weakly decreasing order. Next, 
let $M(\mu) = \{\alpha \in \mathbb{N}^N : \mathrm{sort}(\alpha) = \mu\}$. Finally, define the monomial symmetric polynomial 
indexed by $\mu$ to be 

\[ m_\mu(x_1,\ldots,x_N) = \sum_{\alpha \in M(\mu)} x^\alpha. \]


Informally, $m_\mu(x_1,\ldots,x_N)$ is the sum of all distinct monomials $x_1^{\alpha_1} \cdots x_N^{\alpha_N}$ whose 
exponent vector can be rearranged to give $\mu$. In this notation, the polynomial q above is 
$m_{(3,2)}(x_1,\ldots,x_N)$. Some of our previous examples are instances of monomial symmetric 
polynomials. Namely, we have $p_k(x_1,\ldots,x_N) = m_{(k)}(x_1,\ldots,x_N)$ and $e_k(x_1,\ldots,x_N) = 
m_{(1^k)}(x_1,\ldots,x_N)$. 
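
The set $M(\mu)$ is just the set of distinct rearrangements of the padded vector $\mu$, so $m_\mu$ is simple to build explicitly. In the sketch below (an added illustration with hypothetical names), a polynomial is a `Counter` from exponent vectors to coefficients, and `permute_variables` carries out the substitution $x_i \mapsto x_{\sigma(i)}$, where `sigma[i - 1]` stores $\sigma(i)$.

```python
from collections import Counter
from itertools import permutations


def monomial_symmetric(mu, N):
    """m_mu(x_1..x_N) as a Counter {alpha: 1 for alpha in M(mu)}."""
    parts = tuple(part for part in mu if part > 0)
    if len(parts) > N:
        raise ValueError("mu has more than N nonzero parts")
    padded = parts + (0,) * (N - len(parts))
    return Counter({alpha: 1 for alpha in set(permutations(padded))})


def permute_variables(poly, sigma):
    """Replace each x_i by x_{sigma(i)}; sigma is a tuple with sigma[i-1] = sigma(i)."""
    out = Counter()
    for alpha, coeff in poly.items():
        beta = [0] * len(alpha)
        for i, exponent in enumerate(alpha):
            beta[sigma[i] - 1] = exponent      # x_i^a becomes x_{sigma(i)}^a
        out[tuple(beta)] += coeff
    return out


def is_symmetric(poly, N):
    """Check Definition 10.19 by brute force over all of S_N."""
    return all(permute_variables(poly, sigma) == poly
               for sigma in permutations(range(1, N + 1)))
```

Testing all N! permutations is wasteful but keeps the check literally in the form of the definition; by the remark after 10.19, testing the N − 1 basic transpositions would suffice.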

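Let us check that $m_\mu$ really is symmetric. Given $\sigma \in S_N$, we have 

\[ m_\mu(x_{\sigma(1)},\ldots,x_{\sigma(N)}) = \sum_{\alpha \in M(\mu)} \prod_{i=1}^{N} x_{\sigma(i)}^{\alpha_i} = \sum_{\alpha \in M(\mu)} \prod_{j=1}^{N} x_j^{\alpha_{\sigma^{-1}(j)}}. \]

The last step follows by setting $j = \sigma(i)$ and rearranging the order of factors in each product. 
To continue, introduce a new summation variable $\beta = (\alpha_{\sigma^{-1}(1)},\ldots,\alpha_{\sigma^{-1}(N)})$. The entries 
of $\beta$ are obtained by rearranging the entries of $\alpha$, so $\mathrm{sort}(\beta) = \mathrm{sort}(\alpha) = \mu$. In fact, the map 
$\alpha \mapsto \beta$ is a bijection of $M(\mu)$ to itself with inverse $\beta \mapsto (\beta_{\sigma(1)},\ldots,\beta_{\sigma(N)})$. Since addition is 
commutative, we can continue the calculation by writing 

\[ \sum_{\alpha \in M(\mu)} \prod_{j=1}^{N} x_j^{\alpha_{\sigma^{-1}(j)}} = \sum_{\beta \in M(\mu)} \prod_{j=1}^{N} x_j^{\beta_j} = m_\mu(x_1,\ldots,x_N). \]


10.24. Definition: $\Lambda_N$. Let $\Lambda_N$ be the set of all symmetric polynomials in $K[x_1,\ldots,x_N]$. 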


If two polynomials f and g are symmetric, so are f + g, −f, and fg. For example, 

\[ (fg)(x_{\sigma(1)},\ldots,x_{\sigma(N)}) = f(x_{\sigma(1)},\ldots,x_{\sigma(N)})\, g(x_{\sigma(1)},\ldots,x_{\sigma(N)}) = f(x_1,\ldots,x_N)\, g(x_1,\ldots,x_N) = (fg)(x_1,\ldots,x_N) \quad (\text{for all } \sigma \in S_N). \]

Also, any constant polynomial $c \in K$ is certainly symmetric, as is any scalar multiple cf 
of a symmetric polynomial f. These comments imply that $\Lambda_N$ is a subring and K-vector 
subspace of $K[x_1,\ldots,x_N]$. In particular, $\Lambda_N$ is a commutative ring with identity and a 
vector space over K. 

We have just seen that $\Lambda_N$ is closed under products. So, we can multiply together 
polynomials of the form $e_k$, $h_k$, or $p_k$ to obtain even more examples of symmetric polynomials. 
This leads to the following definition. 


10.25. Definition: The Symmetric Polynomials $e_\alpha$, $h_\alpha$, and $p_\alpha$. Let $\alpha = (\alpha_1,\ldots,\alpha_s)$ 
be any sequence of positive integers. Define 

\[ e_\alpha(x_1,\ldots,x_N) = \prod_{i=1}^{s} e_{\alpha_i}(x_1,\ldots,x_N); \]
\[ h_\alpha(x_1,\ldots,x_N) = \prod_{i=1}^{s} h_{\alpha_i}(x_1,\ldots,x_N); \]
\[ p_\alpha(x_1,\ldots,x_N) = \prod_{i=1}^{s} p_{\alpha_i}(x_1,\ldots,x_N). \]

We call $e_\alpha$ the elementary symmetric polynomial indexed by $\alpha$; $h_\alpha$ the complete 
homogeneous symmetric polynomial indexed by $\alpha$; and $p_\alpha$ the power-sum symmetric polynomial 
indexed by $\alpha$ (in N variables). 


These definitions are most frequently used when $\alpha$ is an integer partition. Suppose the 
sequence $\alpha$ can be sorted to give the partition $\mu$. Then $e_\alpha = e_\mu$, $h_\alpha = h_\mu$, and $p_\alpha = 
p_\mu$, because multiplication of polynomials is commutative. More generally, if $\alpha$ and $\beta$ are 
rearrangements of each other, then $e_\alpha = e_\beta$, $h_\alpha = h_\beta$, and $p_\alpha = p_\beta$. 


10.26. Remark. The power-sum polynomials $p_\mu$ have already appeared in our discussion of 
Pólya’s Formula (§9.19), where they were used to count weighted colorings with symmetries 
taken into account. 


10.27. Remark. The polynomials $e_\alpha$ and $h_\alpha$ are special cases of skew Schur polynomials. 
For example, consider $h_\alpha = h_{\alpha_1} h_{\alpha_2} \cdots h_{\alpha_s}$. We have seen that each factor $h_{\alpha_i}$ is the 
generating function for semistandard tableaux of shape $(\alpha_i)$. There exists a skew shape $\mu/\nu$ 
consisting of disconnected horizontal rows of lengths $\alpha_1,\ldots,\alpha_s$. When building a 
semistandard tableau of this shape, each row can be filled with labels independently of the others. 
So the product rule for weighted sets shows that $h_\alpha(x_1,\ldots,x_N) = s_{\mu/\nu}(x_1,\ldots,x_N)$. For 
example, given $h_{(2,4,3,2)} = h_2 h_4 h_3 h_2 = h_{(4,3,2,2)}$, we draw the skew shape 

    [diagram not reproduced] 

Thus we have $h_{(2,4,3,2)} = s_{(11,9,5,2)/(9,5,2)}$. An analogous procedure works for the $e_\alpha$’s, but 
now we use disconnected vertical columns of lengths given by $\alpha$. For example, $e_{(3,3,1)} = s_{\mu/\nu}$ 
if we take 

    $\mu/\nu$ = [diagram not reproduced] = (3,3,3,2,2,2,1)/(2,2,2,1,1,1). 


10.5 Homogeneous Symmetric Polynomials 


When studying symmetric polynomials, it is often helpful to focus attention on those poly- 
nomials that are homogeneous of a given degree. 

10.28. Definition: Homogeneous Symmetric Polynomials. For all $k, N \in \mathbb{N}$, let $\Lambda_N^k$ 
be the set of symmetric polynomials $p \in \Lambda_N$ such that p is homogeneous of degree k. This 
means that every monomial $x^\alpha$ appearing in p with nonzero coefficient has degree k (i.e., 
$\sum_{i=1}^{N} \alpha_i = k$). In particular, the zero polynomial is homogeneous of every degree. 


One can check that for $f, g \in \Lambda_N^k$ and $c \in K$, we have $f + g \in \Lambda_N^k$ and $cf \in \Lambda_N^k$. This 
means that $\Lambda_N^k$ is a K-vector space. Furthermore, the K-vector space $\Lambda_N$ is the direct sum 
of vector spaces 

\[ \Lambda_N = \bigoplus_{k \ge 0} \Lambda_N^k, \]

since every symmetric polynomial can be uniquely written as a finite sum of its nonzero 
homogeneous components. Moreover, $p \in \Lambda_N^k$ and $q \in \Lambda_N^m$ imply $pq \in \Lambda_N^{k+m}$, which means 
that this direct-sum decomposition turns $\Lambda_N$ into a graded ring. 

The vector space $\Lambda_N$ is infinite-dimensional, but each homogeneous piece $\Lambda_N^k$ is 
finite-dimensional. A key theme in the theory of symmetric polynomials is the problem of finding 
different bases of the vector space $\Lambda_N^k$ and understanding the relations between these bases. 
We begin in this section by considering the most straightforward basis for this vector space, 
which consists of suitable monomial symmetric polynomials. 


10.29. Theorem: Monomial Basis of $\Lambda_N^k$. For every k and N, the indexed set of 
polynomials 

\[ \{ m_\mu(x_1,\ldots,x_N) : \mu \in \mathrm{Par}_N(k) \} \subseteq K[x_1,\ldots,x_N] \]

is a basis for the K-vector space $\Lambda_N^k$. 


Proof. For $\mu \in \mathrm{Par}_N(k)$, recall that $m_\mu$ is the sum of all distinct monomials $x^\alpha$ such that 
$\alpha \in \mathbb{N}^N$ can be rearranged to give $\mu$. Each of these monomials has degree $|\mu| = k$, so that 
each $m_\mu$ in the given set is indeed symmetric and homogeneous of degree k. Next, let us 
prove that the $m_\mu$’s are linearly independent over K. Suppose some linear combination of 
these polynomials is the zero polynomial, say 

\[ \sum_{\mu \in \mathrm{Par}_N(k)} c_\mu\, m_\mu(x_1,\ldots,x_N) = 0 \qquad (c_\mu \in K). \tag{10.3} \]

Consider some fixed $\nu \in \mathrm{Par}_N(k)$. Given any partition $\mu \ne \nu$, we cannot rearrange the parts 
of $\nu$ to obtain $\mu$. It follows that $m_\nu$ is the only monomial symmetric polynomial in the sum 
in which $x^\nu$ appears with nonzero coefficient. The coefficient of $x^\nu$ in $m_\nu$ is 1. Extracting 
the coefficient of $x^\nu$ on both sides of (10.3) therefore gives $c_\nu \cdot 1 = 0$. Since $\nu$ was arbitrary, 
all $c_\mu$’s are zero, completing the proof of linear independence. 

Next, let us prove that the $m_\mu$’s span $\Lambda_N^k$. Let $f(x_1,\ldots,x_N)$ be any homogeneous 
symmetric polynomial of degree k. For each $\mu \in \mathrm{Par}_N(k)$, define $d_\mu \in K$ to be the coefficient 
of $x^\mu$ in f. We claim that 

\[ \sum_{\mu \in \mathrm{Par}_N(k)} d_\mu\, m_\mu(x_1,\ldots,x_N) = f(x_1,\ldots,x_N). \tag{10.4} \]

It suffices to check that, for every $\alpha \in \mathbb{N}^N$ with $|\alpha| = k$, the coefficient of $x^\alpha$ on both sides 
of (10.4) is the same. Fix such an $\alpha$, and note that there is a unique partition $\nu \in \mathrm{Par}_N(k)$ 
such that $\mathrm{sort}(\alpha) = \nu$. Reasoning as before, we see that the coefficient of $x^\alpha$ in $\sum_\mu d_\mu m_\mu$ 
must be $d_\nu$. On the other hand, since f is symmetric, the coefficient of $x^\alpha$ in f must be the 
same as the coefficient of $x^\nu$ in f, since some permutation of the variables will change $x^\alpha$ 
into $x^\nu$. But the coefficient of $x^\nu$ in f is $d_\nu$ by definition, so we are done. □ 


10.30. Remark. If $\mu$ is a partition of k with more than N nonzero parts, then 
$m_\mu(x_1,\ldots,x_N)$ is not defined. If the number of variables N exceeds k, then the condition 
$\ell(\mu) \le N$ automatically holds for all partitions $\mu \vdash k$. Therefore, when N > k, the basis for 
$\Lambda_N^k$ reduces to $\{ m_\mu(x_1,\ldots,x_N) : \mu \in \mathrm{Par}(k) \}$. So, when N > k, we have $\dim(\Lambda_N^k) = p(k)$, 
the number of integer partitions of k. 




10.6 Symmetry of Schur Polynomials 


Recall the definition of skew Schur polynomials from 10.14: 

\[ s_{\mu/\nu}(x_1,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu/\nu)} x^{c(T)}. \]


We are about to give a bijective proof that the polynomial appearing in this definition is 
always symmetric. First, we give names to the coefficients of these polynomials. 


10.31. Definition: Kostka Numbers. For each skew shape $\mu/\nu$ and each $\alpha \in \mathbb{N}^N$, define 
the Kostka number $K_{\mu/\nu,\alpha}$ to be the coefficient of $x^\alpha$ in $s_{\mu/\nu}(x_1,\ldots,x_N)$. Equivalently, 
$K_{\mu/\nu,\alpha}$ is the number of semistandard tableaux of shape $\mu/\nu$ and content $\alpha$. 


10.32. Example. The calculations in 10.17 show that 

\[ K_{(4,3)/(1),(2,2,1,1)} = K_{(4,3)/(1),(1,1,2,2)} = K_{(4,3)/(1),(2,1,1,2)} = 6. \]

Similarly, we see from 10.9 that $K_{(3,2),(1,1,1,1,1)} = 5$. 

The following result is the key to proving the symmetry of Schur polynomials. 


10.33. Theorem: Symmetry of Kostka Numbers. For all skew shapes $\mu/\nu$ and all 
$\alpha, \beta \in \mathbb{N}^N$ such that $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$, we have $K_{\mu/\nu,\alpha} = K_{\mu/\nu,\beta}$. 


Proof. Fix $\mu/\nu$ and $\alpha, \beta$ as in the theorem statement. Since $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$, we can pass 
from $\alpha$ to $\beta$ by a suitable permutation of the entries of $\alpha$. This permutation can be achieved 
in finitely many steps by repeatedly interchanging two consecutive entries of $\alpha$ (cf. 9.29 
and 9.179). By induction, it therefore suffices to prove the result when $\beta$ is obtained from 
$\alpha$ by switching $\alpha_i$ and $\alpha_{i+1}$ for some $i < N$. 

Let Y be the set of all tableaux $T \in \mathrm{SSYT}_N(\mu/\nu)$ such that $c(T) = \alpha$, and let Z be the set 
of all tableaux $T \in \mathrm{SSYT}_N(\mu/\nu)$ such that $c(T) = \beta$. Since $|Y| = K_{\mu/\nu,\alpha}$ and $|Z| = K_{\mu/\nu,\beta}$, 
it suffices to define a bijection $f_i : Y \to Z$. The map $f_i$ must take a semistandard tableau of 
shape $\mu/\nu$ and create a new semistandard tableau of the same shape in which the number 
of i’s and (i + 1)’s are switched, while the number of k’s (for all $k \ne i, i+1$) is unchanged. 
We will illustrate the action of $f_3$ on the following tableau: 

    [tableau not reproduced] 

Observe that certain occurrences of 3 are “matched” with an occurrence of 4 in the cell 
directly below. Let us underline the 3’s and 4’s that are not part of these matched pairs: 

    [tableau not reproduced] 

Notice that each row of the tableau contains a (possibly empty) run of consecutive cells 
consisting of underlined 3’s and 4’s. The entries directly above these cells are < 3, while 
the entries directly below are > 4. So we are free to change the frequency of 3’s and 4’s 
within each run without affecting the semistandardness of the tableau. If the run in a given 
row consists of j threes followed by k fours (where $j, k \ge 0$), we will change this to a run 
consisting of k threes followed by j fours. Doing this in every row will switch the frequency 
of 3’s and 4’s (note that the matched pairs are not touched, and these contribute equally to 
the frequency counts for 3 and 4). Our example tableau is mapped by $f_3$ to the following 
tableau: 

    [tableau not reproduced] 

Applying the same run-modification process to this new tableau will restore the original 
tableau; this means that $f_3$ is a bijection. As another example of the action of $f_3$, we have 

    $f_3$( [tableau not reproduced] ) = [tableau not reproduced] 

The definition of $f_i$ for general i is exactly the same. We locate and ignore matched pairs 
consisting of an i directly atop an i + 1, then underline the remaining i’s and (i + 1)’s, 
then switch the relative frequencies of the underlined i’s and (i + 1)’s in each row. This 
action is reversible, maintains semistandardness, and switches the overall frequency of i’s 
and (i + 1)’s while preserving the frequency of all other letters. So we have found the required 
bijection. □ 
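
The map $f_i$ described in the proof can be written out in a few lines. The following Python sketch is an added illustration, not code from the text; the names `tableau_from_rows`, `content`, and `switch` are hypothetical, and a tableau is stored as a dictionary from cells to entries. Maps of this kind are often called Bender-Knuth involutions in the literature.

```python
def tableau_from_rows(rows, nu=()):
    """Build a tableau {(i, j): entry}; row i starts in column nu_i + 1."""
    nu = list(nu) + [0] * (len(rows) - len(nu))
    return {(i, nu[i - 1] + k): entry
            for i, row in enumerate(rows, start=1)
            for k, entry in enumerate(row, start=1)}


def content(T, N):
    """The first N entries of the content vector c(T)."""
    c = [0] * N
    for entry in T.values():
        c[entry - 1] += 1
    return tuple(c)


def switch(T, i):
    """Apply f_i (as in the proof of 10.33) to a semistandard tableau T."""
    result = dict(T)
    for r in sorted({row for row, _ in T}):
        # Unmatched ("underlined") i's and (i+1)'s of row r, left to right:
        # an i is matched if i+1 lies directly below, an i+1 if i lies directly above.
        free = [(row, c) for (row, c) in sorted(T) if row == r and (
            (T[(row, c)] == i and T.get((r + 1, c)) != i + 1) or
            (T[(row, c)] == i + 1 and T.get((r - 1, c)) != i))]
        num_i = sum(1 for cell in free if T[cell] == i)
        num_next = len(free) - num_i
        # j copies of i then k copies of i+1 become k copies of i then j of i+1.
        for position, cell in enumerate(free):
            result[cell] = i if position < num_next else i + 1
    return result


# Usage: start from the first tableau of content (2,2,1,1) listed in 10.17.
T = tableau_from_rows([[1, 1, 2], [2, 3, 4]], nu=(1,))
after_f2 = switch(T, 2)
after_f3 = switch(after_f2, 3)
```

In the usage lines, `content(after_f2, 4)` and `content(after_f3, 4)` track how the content vector changes at each step, and applying `switch` twice with the same i returns the original tableau.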


10.34. Example. The preceding proof allows us to construct explicit bijections between the 
collections of tableaux in 10.17, which are counted by various Kostka numbers $K_{(4,3)/(1),\alpha}$ 
such that $\mathrm{sort}(\alpha) = (2,2,1,1)$. As directed by the proof, we must chain together suitable 
maps $f_i$, where the values of i are chosen to rearrange the starting content vector $\alpha$ into the 
target content vector $\beta$. For example, we can go from (2,2,1,1) to (2,1,2,1) to (2,1,1,2) 
by applying $f_2$ and then $f_3$. So, for instance, the first tableau of content (2,2,1,1) in 10.17 
is mapped to a tableau of content (2,1,1,2) as follows: 

    [tableaux not reproduced] 

If we continue by applying the maps $f_1$ and then $f_2$, we reach a tableau with content 
(1,1,2,2): 

    [tableaux not reproduced] 

The inverse bijection is computed by applying the maps in the reverse order. For example, 
the first tableau of content (1,1,2,2) in 10.17 is mapped to a tableau of content (2,2,1,1) 
via the following steps. 

    [tableaux not reproduced] 
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We can now deduce the symmetry of skew Schur polynomials. In fact, we can even 
expand these polynomials as linear combinations of monomial symmetric polynomials using 
the Kostka numbers. 


10.35. Theorem: Monomial Expansion of Schur Polynomials. For all skew shapes
$\mu/\nu$ with $k$ boxes and all $N \geq 1$,

$$s_{\mu/\nu}(x_1,\ldots,x_N) = \sum_{\rho \in \mathrm{Par}_N(k)} K_{\mu/\nu,\rho}\, m_\rho(x_1,\ldots,x_N). \qquad (10.5)$$

In particular, $s_{\mu/\nu}(x_1,\ldots,x_N)$ is a homogeneous, symmetric polynomial of degree $k$.


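Proof. Consider the following calculation:

$$\begin{aligned}
s_{\mu/\nu}(x_1,\ldots,x_N)
 &= \sum_{\alpha \in \mathbb{N}^N} K_{\mu/\nu,\alpha}\, x^\alpha
  = \sum_{\rho \in \mathrm{Par}_N(k)} \; \sum_{\substack{\alpha \in \mathbb{N}^N:\\ \mathrm{sort}(\alpha)=\rho}} K_{\mu/\nu,\alpha}\, x^\alpha \\
 &= \sum_{\rho \in \mathrm{Par}_N(k)} \; \sum_{\substack{\alpha \in \mathbb{N}^N:\\ \mathrm{sort}(\alpha)=\rho}} K_{\mu/\nu,\rho}\, x^\alpha
  = \sum_{\rho \in \mathrm{Par}_N(k)} K_{\mu/\nu,\rho} \sum_{\substack{\alpha \in \mathbb{N}^N:\\ \mathrm{sort}(\alpha)=\rho}} x^\alpha \\
 &= \sum_{\rho \in \mathrm{Par}_N(k)} K_{\mu/\nu,\rho}\, m_\rho(x_1,\ldots,x_N).
\end{aligned}$$

The first step follows from the definition of Kostka numbers. In the second step, we reorder
the sum by classifying $\alpha \in \mathbb{N}^N$ based on which partition $\alpha$ sorts to. Only partitions of $k$
occur, since $x^{c(T)}$ is a monomial of degree $k$ for every tableau $T$ on the $k$-box shape $\mu/\nu$.
The third step follows from 10.33. The fourth step uses the fact that $K_{\mu/\nu,\rho}$ does not depend
on the inner summation index $\alpha$. The final step follows by definition of $m_\rho$. □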
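
For small cases the expansion (10.5) can be checked by brute force. The following Python sketch handles straight shapes only (a simplifying assumption; the theorem covers skew shapes), and the function names `ssyt` and `content_counts` are ours. It lists every semistandard tableau of a shape with entries in $\{1,\ldots,N\}$ and tallies tableaux by content vector, so that the tally at $\alpha$ is the Kostka number $K_{\lambda,\alpha}$.

```python
from collections import Counter
from itertools import product


def ssyt(shape, n):
    """Yield all semistandard tableaux of a straight shape with entries in 1..n."""
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    for values in product(range(1, n + 1), repeat=len(cells)):
        t = dict(zip(cells, values))
        rows_ok = all(t[r, c - 1] <= t[r, c] for r, c in cells if c > 0)
        cols_ok = all(t[r - 1, c] < t[r, c] for r, c in cells if r > 0)
        if rows_ok and cols_ok:
            yield t


def content_counts(shape, n):
    """Map each content vector alpha to the number of tableaux with that content."""
    counts = Counter()
    for t in ssyt(shape, n):
        alpha = [0] * n
        for value in t.values():
            alpha[value - 1] += 1
        counts[tuple(alpha)] += 1
    return counts


def is_symmetric(counts):
    """Check that the count depends only on the sorted content vector."""
    return all(
        counts[alpha] == counts[tuple(sorted(alpha, reverse=True))]
        for alpha in counts
    )


counts = content_counts((2, 1), 3)
```

For the shape $(2,1)$ and $N = 3$, the tally gives coefficient 2 on every rearrangement of $(1,1,1)$ and coefficient 1 on every rearrangement of $(2,1,0)$, in agreement with $s_{(2,1)} = 2m_{(1,1,1)} + m_{(2,1)}$.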


10.7 Orderings on Partitions 


We will use 10.35 to find bases for the vector spaces $\Lambda_N^k$ consisting of suitable Schur
polynomials. First, however, we need to introduce some ordering relations on sets of integer
partitions.


10.36. Definition: Lexicographic Ordering of Partitions. Suppose $\mu = (\mu_i : i \geq 1)$
and $\nu = (\nu_i : i \geq 1)$ are partitions of the same integer $k$. We say that $\nu$ is lexicographically
greater than $\mu$, written $\mu \leq_{\mathrm{lex}} \nu$, iff either $\mu = \nu$ or the first nonzero entry in the vector $\nu - \mu$
is positive. The latter condition means that for some $j$, $\mu_1 = \nu_1$, $\mu_2 = \nu_2$, ..., $\mu_{j-1} = \nu_{j-1}$,
and $\mu_j < \nu_j$.

It is routine to check that $\leq_{\mathrm{lex}}$ is a total order on $\mathrm{Par}(k)$, for each $k \geq 0$.


10.37. Example. Here is a list of all integer partitions of 6, written in lexicographic order
from smallest to largest:

$$(1,1,1,1,1,1) \leq_{\mathrm{lex}} (2,1,1,1,1) \leq_{\mathrm{lex}} (2,2,1,1) \leq_{\mathrm{lex}} (2,2,2) \leq_{\mathrm{lex}} (3,1,1,1)$$
$$\leq_{\mathrm{lex}} (3,2,1) \leq_{\mathrm{lex}} (3,3) \leq_{\mathrm{lex}} (4,1,1) \leq_{\mathrm{lex}} (4,2) \leq_{\mathrm{lex}} (5,1) \leq_{\mathrm{lex}} (6).$$

For example, $(3,1,1,1) \leq_{\mathrm{lex}} (3,2,1)$ since

$$(3,2,1,0) - (3,1,1,1) = (0,1,0,-1)$$

and the earliest nonzero entry in this vector is positive.

In the coming sections, we will frequently be considering matrices and vectors whose
rows and columns are indexed by integer partitions. Unless otherwise specified, we will
always use the lexicographic ordering of partitions to determine which partition labels each
row and column of the matrix. For instance, when $k = 3$, a matrix $A = (c_{\mu,\nu} : \mu,\nu \in \mathrm{Par}(3))$
will be displayed as follows:

$$A = \begin{bmatrix}
c_{(1,1,1),(1,1,1)} & c_{(1,1,1),(2,1)} & c_{(1,1,1),(3)} \\
c_{(2,1),(1,1,1)} & c_{(2,1),(2,1)} & c_{(2,1),(3)} \\
c_{(3),(1,1,1)} & c_{(3),(2,1)} & c_{(3),(3)}
\end{bmatrix}$$
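
The lexicographic indexing convention is easy to reproduce by machine. In the Python sketch below (function names are ours), partitions are stored as weakly decreasing tuples of positive integers. For two partitions of the same integer, Python's built-in tuple comparison agrees with $\leq_{\mathrm{lex}}$, because neither tuple can be a proper prefix of the other.

```python
def partitions(k, largest=None):
    """Yield the partitions of k as weakly decreasing tuples of positive integers."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest


def lex_sorted_partitions(k):
    """List Par(k) from lexicographically smallest to largest."""
    return sorted(partitions(k))


# Row/column index of each partition of 3 in a matrix indexed by Par(3).
index = {p: pos for pos, p in enumerate(lex_sorted_partitions(3))}
```

Here `index` sends $(1,1,1)$, $(2,1)$, $(3)$ to positions 0, 1, 2, matching the row and column labels of the matrix $A$ displayed above.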


Next we consider a partial ordering on partitions that occurs frequently in the theory of 
symmetric polynomials. 


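10.38. Definition: Dominance Ordering on Partitions. Let $\mu = (\mu_i : i \geq 1)$ and
$\nu = (\nu_i : i \geq 1)$ be two partitions of the same integer $k$. We say that $\nu$ dominates $\mu$, written
$\mu \trianglelefteq \nu$, iff

$$\mu_1 + \mu_2 + \cdots + \mu_i \leq \nu_1 + \nu_2 + \cdots + \nu_i \quad \text{for all } i \geq 1.$$

Note that $\mu \not\trianglelefteq \nu$ iff there exists an $i \geq 1$ with $\mu_1 + \cdots + \mu_i > \nu_1 + \cdots + \nu_i$.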


10.39. Example. We have $(2,2,1,1) \trianglelefteq (4,2)$ since $2 \leq 4$, $2+2 \leq 4+2$, $2+2+1 \leq 4+2+0$,
and $2+2+1+1 \leq 4+2+0+0$. On the other hand, $(3,1,1,1) \not\trianglelefteq (2,2,2)$ since $3 > 2$, and
$(2,2,2) \not\trianglelefteq (3,1,1,1)$ since $2+2+2 > 3+1+1$. This example shows that not every pair of
partitions is comparable under the dominance relation.


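10.40. Theorem: Dominance Partial Order. The dominance relation is a partial
ordering on $\mathrm{Par}(k)$, for every $k \geq 0$.


Proof. We will show that $\trianglelefteq$ is reflexive, antisymmetric, and transitive on $\mathrm{Par}(k)$. Reflexivity:
Given $\mu \vdash k$, we have $\mu_1 + \cdots + \mu_i \leq \mu_1 + \cdots + \mu_i$ for all $i \geq 1$. So $\mu \trianglelefteq \mu$. Antisymmetry:
Suppose $\mu, \nu \vdash k$, $\mu \trianglelefteq \nu$, and $\nu \trianglelefteq \mu$. We know $\mu_1 + \cdots + \mu_i \leq \nu_1 + \cdots + \nu_i$ and also
$\nu_1 + \cdots + \nu_i \leq \mu_1 + \cdots + \mu_i$ for all $i$, hence $\mu_1 + \cdots + \mu_i = \nu_1 + \cdots + \nu_i$ for all $i \geq 1$. In
particular, taking $i = 1$ gives $\mu_1 = \nu_1$. For each $i > 1$, subtracting the $(i-1)$th equation
from the $i$th equation shows that $\mu_i = \nu_i$. So $\mu = \nu$. Transitivity: Fix $\mu, \nu, \rho \vdash k$, and
assume $\mu \trianglelefteq \nu \trianglelefteq \rho$; we must prove $\mu \trianglelefteq \rho$. We know $\mu_1 + \cdots + \mu_i \leq \nu_1 + \cdots + \nu_i$ for all
$i$, and also $\nu_1 + \cdots + \nu_i \leq \rho_1 + \cdots + \rho_i$ for all $i$. Combining these inequalities yields
$\mu_1 + \cdots + \mu_i \leq \rho_1 + \cdots + \rho_i$ for all $i$, so $\mu \trianglelefteq \rho$. □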


One can check that $\trianglelefteq$ is a total ordering of $\mathrm{Par}(k)$ iff $k \leq 5$.


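10.41. Theorem: Lexicographic vs. Dominance Ordering. For all $\mu, \nu \vdash k$, $\mu \trianglelefteq \nu$
implies $\mu \leq_{\mathrm{lex}} \nu$.


Proof. Fix $\mu, \nu$ such that $\mu \not\leq_{\mathrm{lex}} \nu$; we will prove that $\mu \not\trianglelefteq \nu$. By definition of the lexicographic
order, there must exist an index $j \geq 1$ such that $\mu_i = \nu_i$ for all $i < j$, but $\mu_j > \nu_j$.
Adding these relations together, we see that $\mu_1 + \cdots + \mu_j > \nu_1 + \cdots + \nu_j$, and so $\mu \not\trianglelefteq \nu$. □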
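
The dominance test is a comparison of partial sums, so the statements above can be spot-checked by computer for small $k$. The following Python sketch uses our own function names (`dominated_by`, `is_total`) and assumes both arguments are partitions of the same integer, stored as weakly decreasing tuples.

```python
from itertools import accumulate


def partitions(k, largest=None):
    """Yield the partitions of k as weakly decreasing tuples."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest


def dominated_by(mu, nu):
    """Return True when nu dominates mu (both must have the same size)."""
    n = max(len(mu), len(nu))
    mu_sums = accumulate(tuple(mu) + (0,) * (n - len(mu)))
    nu_sums = accumulate(tuple(nu) + (0,) * (n - len(nu)))
    return all(a <= b for a, b in zip(mu_sums, nu_sums))


def is_total(k):
    """Check whether every pair of partitions of k is comparable under dominance."""
    parts = list(partitions(k))
    return all(dominated_by(p, q) or dominated_by(q, p) for p in parts for q in parts)
```

With these helpers, `dominated_by((3, 1, 1, 1), (2, 2, 2))` and `dominated_by((2, 2, 2), (3, 1, 1, 1))` are both false, as in 10.39, and `is_total(k)` can be evaluated for small `k` to test the remark that dominance is a total order exactly when $k \leq 5$.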


The next definition and theorem will allow us to visualize the dominance relation in 
terms of partition diagrams. 


10.42. Definition: Raising Operation. Let $\mu$ and $\nu$ be two partitions of $k$. We say that $\nu$
is related to $\mu$ by a raising operation, denoted $\mu R \nu$, iff there exist $i < j$ such that $\nu_i = \mu_i + 1$,
$\nu_j = \mu_j - 1$, and $\nu_s = \mu_s$ for all $s \neq i, j$.


Intuitively, $\mu R \nu$ means that we can go from the diagram for $\mu$ to the diagram for $\nu$ by
taking the last square from some row of $\mu$ and moving it to the end of a higher row.

10.43. Example. The following pictures illustrate a sequence of raising operations.


[Figure not recoverable from the source: a sequence of partition diagrams linked by raising operations.]


Observe that $(4,3,3,1,1) \trianglelefteq (5,4,2,1)$, so that the last partition in the sequence dominates
the first one. The next theorem shows that this always happens.


10.44. Theorem: Dominance and Raising Operations. Given $\mu, \nu \vdash k$, we have $\mu \trianglelefteq \nu$
iff there exist $m \geq 0$ and partitions $\mu^0, \ldots, \mu^m$ such that $\mu = \mu^0 R \mu^1 R \mu^2 \cdots \mu^{m-1} R \mu^m = \nu$.


Proof. Let us first show that $\mu R \nu$ implies $\mu \trianglelefteq \nu$. Suppose $\nu = (\mu_1, \ldots, \mu_i + 1, \ldots, \mu_j - 1, \ldots)$
as in the definition of a raising operation. Let us check that $\mu_1 + \cdots + \mu_k \leq \nu_1 + \cdots + \nu_k$
holds for all $k \geq 1$. This is true for $k < i$, since equality holds for these $k$'s. If $k = i$, note
that $\nu_1 + \cdots + \nu_k = \mu_1 + \cdots + \mu_{i-1} + (\mu_i + 1)$, so the required inequality does hold. Similarly,
for all $k$ with $i < k < j$, we have $\nu_1 + \cdots + \nu_k = \mu_1 + \cdots + \mu_k + 1 > \mu_1 + \cdots + \mu_k$. Finally,
for all $k \geq j$, we have $\mu_1 + \cdots + \mu_k = \nu_1 + \cdots + \nu_k$ since the $+1$ and $-1$ adjustments to
parts $i$ and $j$ cancel out.

Next, suppose $\mu$ and $\nu$ are linked by a chain of raising operations as in the theorem
statement, say $\mu = \mu^0 R \mu^1 R \mu^2 \cdots R \mu^m = \nu$. By what has just been proved, we have
$\mu = \mu^0 \trianglelefteq \mu^1 \trianglelefteq \mu^2 \cdots \trianglelefteq \mu^m = \nu$. Since $\trianglelefteq$ is transitive, we conclude that $\mu \trianglelefteq \nu$, as desired.

Conversely, suppose that $\mu \trianglelefteq \nu$. Consider the vector $(d_1, d_2, \ldots)$ such that
$d_s = (\nu_1 + \cdots + \nu_s) - (\mu_1 + \cdots + \mu_s)$. Since $\mu \trianglelefteq \nu$, we have $d_s \geq 0$ for all $s$. Also, $d_s = 0$ for all large
enough $s$ since $\mu$ and $\nu$ are both partitions of $k$. We argue by induction on $n = \sum_s d_s$. If
$n = 0$, then $\mu = \nu$, and we can take $m = 0$ and $\mu = \mu^0 = \nu$. Otherwise, let $i$ be the least
index such that $d_i > 0$, and let $j$ be the least index after $i$ such that $d_j = 0$. The choice of $i$
shows that $\mu_s = \nu_s$ for all $s < i$, but $\mu_i < \nu_i$. If $i > 1$, the inequality $\mu_i < \nu_i \leq \nu_{i-1} = \mu_{i-1}$
shows that it is possible to add one box to the end of row $i$ in $\mathrm{dg}(\mu)$ and still get a partition
diagram. If $i = 1$, the addition of this box will certainly give a partition diagram. On the
other hand, the relations $d_{j-1} > 0$, $d_j = 0$ mean that $\mu_1 + \cdots + \mu_{j-1} < \nu_1 + \cdots + \nu_{j-1}$ but
$\mu_1 + \cdots + \mu_j = \nu_1 + \cdots + \nu_j$, so that $\mu_j > \nu_j$. Furthermore, from $d_j = 0$ and $d_{j+1} \geq 0$ we
deduce that $\mu_{j+1} \leq \nu_{j+1}$. So, $\mu_{j+1} \leq \nu_{j+1} \leq \nu_j < \mu_j$, which shows that we can remove a
box from row $j$ of $\mathrm{dg}(\mu)$ and still get a partition diagram.

We have just shown that it is permissible to modify $\mu$ by a raising operator that moves
the box at the end of row $j$ to the end of row $i$. Let $\mu^1$ be the new partition obtained in this
way, so that $\mu R \mu^1$. Consider how the partial sums $\mu_1 + \cdots + \mu_s$ change when we replace
$\mu$ by $\mu^1$. For $s < i$ or $s \geq j$, the partial sums are the same for $\mu$ and $\mu^1$. For $i \leq s < j$,
the partial sums increase by 1. Since $d_s > 0$ in the range $i \leq s < j$, it follows that the
new differences $d'_s = (\nu_1 + \cdots + \nu_s) - (\mu^1_1 + \cdots + \mu^1_s)$ are all $\geq 0$; in other words, $\mu^1 \trianglelefteq \nu$.
We have $d'_s = d_s - 1$ for $i \leq s < j$, and $d'_s = d_s$ for all other $s$; so $\sum_s d'_s < \sum_s d_s$. Arguing
by induction, we can find a chain of raising operations linking $\mu^1$ to $\nu$. This completes the
inductive proof. □
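
The inductive argument is constructive: it tells us which box to move at each stage. The Python sketch below follows that recipe with 0-based indices; the function name `raising_chain` and the error handling are our own additions. At each step it computes the differences $d_s$, picks the least $i$ with $d_i > 0$ and the least later $j$ with $d_j = 0$, and moves one box from row $j$ to row $i$.

```python
def raising_chain(mu, nu):
    """Return a list of partitions mu = p0, p1, ..., pm = nu linked by raising operations.

    Raises ValueError if the sizes differ or if nu does not dominate mu.
    """
    if sum(mu) != sum(nu):
        raise ValueError("partitions must have the same size")
    n = max(len(mu), len(nu))
    cur = list(mu) + [0] * (n - len(mu))
    target = list(nu) + [0] * (n - len(nu))
    chain = [tuple(mu)]
    while cur != target:
        # d[s] = (nu_1 + ... + nu_s) - (cur_1 + ... + cur_s), with 0-based s.
        d, total = [], 0
        for a, b in zip(cur, target):
            total += b - a
            d.append(total)
        if min(d) < 0:
            raise ValueError("mu is not dominated by nu")
        i = next(s for s in range(n) if d[s] > 0)
        j = next(s for s in range(i + 1, n) if d[s] == 0)
        cur[i] += 1   # add a box at the end of row i
        cur[j] -= 1   # remove the box at the end of row j
        chain.append(tuple(part for part in cur if part > 0))
    return chain
```

For instance, `raising_chain((4, 3, 3, 1, 1), (5, 4, 2, 1))` passes through $(5,3,3,1)$. This is one valid chain produced by the proof's rule, and it need not coincide with the sequence pictured in 10.43.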


As an application of the previous result, we prove the following fact relating the domi- 
nance ordering to the conjugation operation on partitions. 


10.45. Theorem: Dominance vs. Conjugation. For all $\mu, \nu \in \mathrm{Par}(k)$, $\mu \trianglelefteq \nu$ iff $\nu' \trianglelefteq \mu'$.


Proof. Fix $\mu, \nu \in \mathrm{Par}(k)$. Note first that $\mu R \nu$ implies $\nu' R \mu'$. This assertion follows from the
pictorial description of the raising operation, since the box that moves from a lower row
in $\mu$ to a higher row in $\nu$ necessarily moves from some column to a column strictly to its
right. Reversing the direction of motion and transposing the diagrams, we see that we can
go from $\nu'$ to $\mu'$ by moving a box in $\nu'$ from a lower row to a higher row.

Next, assuming $\mu \trianglelefteq \nu$, 10.44 shows that there is a chain

$$\mu = \mu^0 R \mu^1 R \mu^2 \cdots \mu^{m-1} R \mu^m = \nu.$$

Applying the remark in the previous paragraph to each link in this chain gives a new chain

$$\nu' = (\mu^m)' R (\mu^{m-1})' R \cdots (\mu^1)' R (\mu^0)' = \mu'.$$

Invoking 10.44 again, we see that $\nu' \trianglelefteq \mu'$.

Conversely, assume that $\nu' \trianglelefteq \mu'$. Applying the result just proved, we get $\mu'' \trianglelefteq \nu''$. Since
$\mu'' = \mu$ and $\nu'' = \nu$, we have $\mu \trianglelefteq \nu$. □
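
As a sanity check on 10.45, one can compute conjugates and compare both dominance relations over all pairs of partitions of a small integer. The helper names in this Python sketch (`conjugate`, `dominated_by`, `check_conjugation`) are ours, and partitions are again weakly decreasing tuples.

```python
from itertools import accumulate


def partitions(k, largest=None):
    """Yield the partitions of k as weakly decreasing tuples."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest


def conjugate(mu):
    """Return the conjugate partition (the column lengths of the diagram of mu)."""
    width = mu[0] if mu else 0
    return tuple(sum(1 for part in mu if part > c) for c in range(width))


def dominated_by(mu, nu):
    """Return True when nu dominates mu (same size assumed)."""
    n = max(len(mu), len(nu))
    mu_sums = accumulate(tuple(mu) + (0,) * (n - len(mu)))
    nu_sums = accumulate(tuple(nu) + (0,) * (n - len(nu)))
    return all(a <= b for a, b in zip(mu_sums, nu_sums))


def check_conjugation(k):
    """Test 'mu dominated by nu iff nu' dominated by mu'' over all of Par(k)."""
    parts = list(partitions(k))
    return all(
        dominated_by(mu, nu) == dominated_by(conjugate(nu), conjugate(mu))
        for mu in parts for nu in parts
    )
```

A finite check of this kind is no substitute for the proof, but it is a useful guard against indexing mistakes when implementing conjugation.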




10.8 Schur Bases 


We now have all the necessary tools to find bases for the vector spaces $\Lambda_N^k$ consisting of
Schur polynomials. First we illustrate the key ideas with an example.


10.46. Example. In 10.15, we computed the Schur polynomials $s_\mu(x_1, x_2, x_3)$ for all partitions
$\mu \in \mathrm{Par}(3)$. We can use 10.35 to write these Schur polynomials as linear combinations
of monomial symmetric polynomials, where the coefficients are Kostka numbers:

$$\begin{aligned}
s_{(1,1,1)}(x_1,x_2,x_3) &= m_{(1,1,1)}(x_1,x_2,x_3); \\
s_{(2,1)}(x_1,x_2,x_3) &= 2m_{(1,1,1)}(x_1,x_2,x_3) + m_{(2,1)}(x_1,x_2,x_3); \\
s_{(3)}(x_1,x_2,x_3) &= m_{(1,1,1)}(x_1,x_2,x_3) + m_{(2,1)}(x_1,x_2,x_3) + m_{(3)}(x_1,x_2,x_3).
\end{aligned}$$

These equations can be combined to give the following matrix identity:

$$\begin{bmatrix} s_{(1,1,1)} \\ s_{(2,1)} \\ s_{(3)} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} m_{(1,1,1)} \\ m_{(2,1)} \\ m_{(3)} \end{bmatrix}.$$

The $3 \times 3$ matrix appearing here is lower-triangular with ones on the main diagonal, hence
is invertible. Multiplying by the inverse matrix, we find that

$$\begin{bmatrix} m_{(1,1,1)} \\ m_{(2,1)} \\ m_{(3)} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} s_{(1,1,1)} \\ s_{(2,1)} \\ s_{(3)} \end{bmatrix}.$$

This says that each monomial symmetric polynomial $m_\mu(x_1, x_2, x_3)$ is expressible as a linear
combination of the Schur polynomials $s_\mu(x_1, x_2, x_3)$. Since the $m_\mu$'s form a basis of the
vector space $\Lambda_3^3$, the Schur polynomials must span this space. Since $\dim(\Lambda_3^3) = p(3) = 3$,
the three-element set $\{s_\mu(x_1, x_2, x_3) : \mu \vdash 3\}$ is in fact a basis of $\Lambda_3^3$.
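
The inverse matrix in this example can be recomputed by forward substitution, which stays within the integers because the diagonal entries are all 1. The Python sketch below is a generic routine for lower-unitriangular integer matrices; the function names are ours.

```python
def invert_lower_unitriangular(K):
    """Invert a lower-triangular integer matrix with 1's on the diagonal."""
    n = len(K)
    inv = [[0] * n for _ in range(n)]
    for col in range(n):
        inv[col][col] = 1
        for row in range(col + 1, n):
            # Entry (row, col) of K * inv must vanish; solve for inv[row][col].
            inv[row][col] = -sum(K[row][t] * inv[t][col] for t in range(col, row))
    return inv


def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)] for r in range(n)]


K3 = [[1, 0, 0], [2, 1, 0], [1, 1, 1]]
K3_inverse = invert_lower_unitriangular(K3)   # [[1, 0, 0], [-2, 1, 0], [1, -1, 1]]
```

Reading off the rows of `K3_inverse` gives, for example, $m_{(3)} = s_{(1,1,1)} - s_{(2,1)} + s_{(3)}$.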


The argument given in the example extends to the general case. The key fact is that the 
transition matrix from Schur polynomials to monomial symmetric polynomials is always 
lower-triangular with ones on the main diagonal, as shown next. 


10.47. Theorem: Lower Unitriangularity of the Kostka Matrix. For all partitions
$\lambda$, $K_{\lambda,\lambda} = 1$. For all partitions $\lambda$ and $\mu$, $K_{\lambda,\mu} \neq 0$ implies $\mu \trianglelefteq \lambda$ (and also $\mu \leq_{\mathrm{lex}} \lambda$, by 10.41).

Proof. The Kostka number $K_{\lambda,\lambda}$ is the number of semistandard tableaux $T$ of shape $\lambda$ and
content $\lambda$. Such a tableau must contain $\lambda_i$ copies of $i$ for each $i \geq 1$. In particular, $T$ contains
$\lambda_1$ ones. Since $T$ is semistandard, all these ones must occur in the top row, which has $\lambda_1$
boxes. So the top row of $T$ contains all ones. For the same reason, the $\lambda_2$ twos in $T$ must all
occur in the second row, which has $\lambda_2$ boxes. Arguing similarly, we see that $T$ must be the
tableau whose $i$th row contains all $i$'s, for $i \geq 1$. Thus, there is exactly one semistandard
tableau of shape $\lambda$ and content $\lambda$. For example, when $\lambda = (4,2,2,1)$, we must have the
tableau whose rows are 1 1 1 1, 2 2, 3 3, and 4.


For the second part of the theorem, we argue by contradiction. Assume $\lambda, \mu \in \mathrm{Par}(k)$
are such that $K_{\lambda,\mu} \neq 0$ and yet $\mu \not\trianglelefteq \lambda$. Since the Kostka number is nonzero, there exists a
semistandard tableau $T$ of shape $\lambda$ and content $\mu$. Since the content of $T$ is $\mu$, the entries
of $T$ come from the alphabet $\{1, 2, \ldots, \ell(\mu)\}$. Since the columns of $T$ must strictly increase,
we observe that all 1's in $T$ must occur in row 1; all 2's in $T$ must occur in row 1 or row
2; and, in general, all $i$'s in $T$ must occur in the top $i$ rows of $\mathrm{dg}(\lambda)$. Now, the assumption
$\mu \not\trianglelefteq \lambda$ means that there is an $i \geq 1$ with $\mu_1 + \cdots + \mu_i > \lambda_1 + \cdots + \lambda_i$. The left side of
this inequality is the total number of occurrences of the symbols $1, 2, \ldots, i$ in $T$. The right
side of the inequality is the total number of boxes in the top $i$ rows of $T$. Our preceding
observation now produces the desired contradiction, since there is not enough room in the
top $i$ rows of $\mathrm{dg}(\lambda)$ to accommodate the $\mu_1 + \cdots + \mu_i$ occurrences of the symbols $1, 2, \ldots, i$
in $T$. □


10.48. Example. Let $\lambda = (3,2,2)$ and $\mu = (2,2,2,1)$. The Kostka number $K_{\lambda,\mu}$ is 3, as
we see by listing the semistandard tableaux of shape $\lambda$ and content $\mu$:


In each tableau, all occurrences of $i$ appear in the top $i$ rows, and we do have $\mu \trianglelefteq \lambda$.
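
Both the value $K_{(3,2,2),(2,2,2,1)} = 3$ and the unitriangular shape of a small Kostka matrix can be confirmed by exhaustive search. The Python sketch below is a deliberately naive counter (every filling with the right content is tested for semistandardness), so it is only practical for shapes with a handful of boxes; the function names are ours.

```python
from itertools import product


def partitions(k, largest=None):
    """Yield the partitions of k as weakly decreasing tuples."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest


def kostka(shape, content):
    """Count semistandard tableaux of a straight shape with the given content."""
    n = len(content)
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    count = 0
    for values in product(range(1, n + 1), repeat=len(cells)):
        if tuple(values.count(v) for v in range(1, n + 1)) != tuple(content):
            continue
        t = dict(zip(cells, values))
        rows_ok = all(t[r, c - 1] <= t[r, c] for r, c in cells if c > 0)
        cols_ok = all(t[r - 1, c] < t[r, c] for r, c in cells if r > 0)
        if rows_ok and cols_ok:
            count += 1
    return count


def kostka_matrix(k):
    """Kostka matrix for Par(k), rows and columns in lexicographic order."""
    parts = sorted(partitions(k))
    return [[kostka(lam, mu) for mu in parts] for lam in parts]
```

For $k = 4$ the resulting $5 \times 5$ matrix has ones on the diagonal and zeros above it, as 10.47 predicts.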


10.49. Theorem: Schur Basis of $\Lambda_N^k$. For all $k, N \in \mathbb{N}$, the set of Schur polynomials

$$\{s_\lambda(x_1,\ldots,x_N) : \lambda \in \mathrm{Par}_N(k)\} \subseteq K[x_1,\ldots,x_N]$$

is a basis of the $K$-vector space $\Lambda_N^k$.


Proof. Let $p = |\mathrm{Par}_N(k)|$, and let $\mathbf{S}$ be the $p \times 1$ column vector consisting of the Schur
polynomials $\{s_\lambda(x_1,\ldots,x_N) : \lambda \in \mathrm{Par}_N(k)\}$, arranged in lexicographic order. Let $\mathbf{M}$ be the
$p \times 1$ column vector consisting of the monomial symmetric polynomials $\{m_\mu(x_1,\ldots,x_N) :
\mu \in \mathrm{Par}_N(k)\}$, also arranged in lexicographic order. Finally, let $\mathbf{K}$ be the $p \times p$ matrix, with
rows and columns indexed by elements of $\mathrm{Par}_N(k)$ in lexicographic order, such that the
entry in row $\lambda$ and column $\mu$ is the Kostka number $K_{\lambda,\mu}$. Now 10.35 says that, for every
$\lambda \in \mathrm{Par}_N(k)$,

$$s_\lambda(x_1,\ldots,x_N) = \sum_{\mu \in \mathrm{Par}_N(k)} K_{\lambda,\mu}\, m_\mu(x_1,\ldots,x_N).$$

These scalar equations are equivalent to the matrix-vector equation $\mathbf{S} = \mathbf{K}\mathbf{M}$. Moreover,
10.47 asserts that $\mathbf{K}$ is a lower-triangular matrix of integers with 1's on the main
diagonal. So $\mathbf{K}$ has an inverse matrix (whose entries are also integers, since $\det(\mathbf{K}) = 1$).
Multiplying on the left by this inverse matrix, we get $\mathbf{M} = \mathbf{K}^{-1}\mathbf{S}$. This equation means
that every $m_\mu$ is a linear combination of Schur polynomials. Since the $m_\mu$'s generate $\Lambda_N^k$,
the Schur polynomials must also generate this space. Linear independence follows automatically
since the number of Schur polynomials in the proposed basis (namely $p$) equals the
dimension of the vector space, by 10.29. □


10.50. Remark. The matrix $\mathbf{K}$ occurring in this proof is called a Kostka matrix. The entries
of the inverse Kostka matrix $\mathbf{K}^{-1}$ tell us how to expand monomial symmetric polynomials
in terms of Schur polynomials. As seen in the $3 \times 3$ example, these entries are integers
which can be negative. It is natural to ask for a combinatorial interpretation for these
matrix entries in terms of suitable collections of signed objects. One such interpretation
will be discussed in §11.15 below.


10.51. Remark. If $\lambda \in \mathrm{Par}(k)$ has more than $N$ parts, then $s_\lambda(x_1,\ldots,x_N) = 0$. This
follows since there are not enough letters available in the alphabet to fill the first column
of $\mathrm{dg}(\lambda)$ with a strictly increasing sequence. So there are no semistandard tableaux of this
shape on this alphabet.




10.9 Tableau Insertion 


We have seen that the Kostka numbers give the coefficients of the monomial expansion
of Schur polynomials. Remarkably, the Kostka numbers also relate Schur polynomials to
the elementary and complete homogeneous symmetric polynomials. This fact will be a
consequence of the Pieri rules, which tell us how to rewrite products of the form $s_\mu e_k$ and
$s_\mu h_k$ as linear combinations of Schur polynomials.

To develop these results, we need a fundamental combinatorial construction on tableaux
called tableau insertion. Given a semistandard tableau $T$ of shape $\mu$ and a letter $x$, we
wish to build a new semistandard tableau by “inserting $x$ into $T$.” The following recursive
procedure allows us to do this.


10.52. Definition: Tableau Insertion Algorithm. Let $T$ be a semistandard tableau of
straight shape $\mu$ over the ordered alphabet $X$, and let $x \in X$. We define a new tableau,
denoted $T \leftarrow x$, by the following procedure.


1. If $\mu = 0$, so that $T$ is the empty tableau, then $T \leftarrow x$ is the tableau of shape
(1) whose sole entry is $x$.


2. Otherwise, let $y_1 \leq y_2 \leq \cdots \leq y_m$ be the entries in the top row of $T$.


2a. If $y_m \leq x$, then $T \leftarrow x$ is the tableau of shape $(\mu_1 + 1, \mu_2, \ldots)$ obtained by
placing a new box containing $x$ at the right end of the top row of $T$.

2b. Otherwise, choose the minimal $i \in \{1, 2, \ldots, m\}$ such that $x < y_i$. Let $T'$ be
the semistandard tableau consisting of all rows of $T$ after the first one.
To form $T \leftarrow x$, first replace $y_i$ by $x$ in the top row of $T$. Then replace $T'$
by $T' \leftarrow y_i$, which is computed recursively by the same algorithm.


If step 2b occurs, we say that $x$ has bumped $y_i$ out of row 1. In turn, $y_i$ may bump an
element from row 2 to row 3, and so on.
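
The insertion procedure translates directly into code. The Python sketch below is an iterative version of the recursive definition, under our own conventions: a tableau is a list of rows (each a weakly increasing list), the input is assumed to be semistandard, and the function returns both $T \leftarrow x$ and the 0-based position `(row, column)` of the new box. The name `insert` is ours.

```python
from bisect import bisect_right


def insert(tableau, x):
    """Return (T <- x, new_box) for a semistandard tableau given as a list of rows."""
    rows = [list(row) for row in tableau]   # work on a copy
    r = 0
    while True:
        if r == len(rows):
            # Step 1 applied to the (empty) remainder: start a new row.
            rows.append([x])
            return rows, (r, 0)
        row = rows[r]
        pos = bisect_right(row, x)          # leftmost entry strictly larger than x
        if pos == len(row):
            # Step 2a: every entry is <= x, so x goes at the end of this row.
            row.append(x)
            return rows, (r, pos)
        # Step 2b: x bumps row[pos] into the next row.
        row[pos], x = x, row[pos]
        r += 1
```

For example, inserting 2 into the tableau with rows `[1, 2, 4]` and `[3]` bumps the 4 into the second row and yields rows `[1, 2, 2]` and `[3, 4]`. Using `bisect_right` relies on each row being sorted, which holds for semistandard input.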


This recursively defined insertion algorithm always terminates, since the number of times
we execute step 2b is at most $\ell(\mu)$, which is finite. We must also prove that the algorithm
always produces a tableau that is semistandard and of partition shape. We will prove these
facts after considering some examples.


10.53. Example. Let us compute $T \leftarrow 3$, where


We scan the top row of $T$ from left to right, looking for the first entry strictly larger than
3. This entry is the 4 in the fifth box. In step 2b, the 3 bumps the 4 into the second row.
The current situation looks like this:


Now we scan the second row from left to right, looking for the first entry strictly larger
than 4. It is the 5, so the 4 bumps the 5 into the third row:


In the third row, the 5 in turn bumps a 7 into the fourth row. Now, everything in the fourth
row is weakly smaller than 7. So, as directed by step 2a, we insert 7 at the end of this row.
The final tableau is therefore


We have underlined the entries of $T \leftarrow 3$ that were affected by the insertion process. These
entries are the starting value $x = 3$ together with those entries that got bumped during the
insertion. Call these entries the bumping sequence; in this example, the bumping sequence is
(3, 4, 5, 7). The sequence of boxes occupied by the bumping sequence is called the bumping
path. The lowest box in the bumping path is called the new box. It is the only box in $T \leftarrow 3$
that was not present in the original diagram for $T$.

For a simpler example of tableau insertion, note that


To prove that $T \leftarrow x$ is always semistandard of partition shape, we need the following
result.


10.54. Theorem: Bumping Sequence and Bumping Path. Given a semistandard
tableau $T$ and element $x$, let $(x_1, x_2, \ldots, x_k)$ be the bumping sequence and let
$((1, j_1), (2, j_2), \ldots, (k, j_k))$ be the bumping path arising in the computation of $T \leftarrow x$. Then
$x = x_1 < x_2 < \cdots < x_k$ and $j_1 \geq j_2 \geq \cdots \geq j_k > 0$. (So the bumping sequence strictly
increases and the bumping path moves weakly left as it goes down.)


Proof. By definition of the bumping sequence, $x = x_1$ and $x_i$ bumps $x_{i+1}$ from row $i$ into
row $i+1$, for $1 \leq i < k$. By definition of bumping, $x_i$ bumps an entry strictly larger than
itself, so $x_i < x_{i+1}$ for all $i < k$. Next, consider what happens to $x_{i+1}$ when it is bumped
out of row $i$. Before being bumped, $x_{i+1}$ occupied the cell $(i, j_i)$. After being bumped, $x_{i+1}$
will occupy the cell $(i+1, j_{i+1})$, which is either an existing cell in row $i+1$ of $T$, or a new
cell at the end of this row. Consider the cell $(i+1, j_i)$ directly below $(i, j_i)$. If this cell is
outside the shape of $T$, the previous observation shows that $(i+1, j_{i+1})$ must be located
weakly left of this cell, so that $j_{i+1} \leq j_i$. On the other hand, if $(i+1, j_i)$ is part of $T$ and
contains some value $z$, then $x_{i+1} < z$ because $T$ is semistandard. Now, $x_{i+1}$ bumps the
leftmost entry in row $i+1$ that is strictly larger than $x_{i+1}$. Since $z$ is such an entry, $x_{i+1}$
bumps $z$ or some entry to the left of $z$. In either case, we again have $j_{i+1} \leq j_i$. □


10.55. Theorem: Output of a Tableau Insertion. If $T$ is a semistandard tableau of
shape $\mu$, then $T \leftarrow x$ is a semistandard tableau whose shape is a partition obtained by
adding one new box to $\mathrm{dg}(\mu)$.


Proof. Let us first show that the shape of $T \leftarrow x$ is a partition diagram. This shape is
obtained from $\mathrm{dg}(\mu)$ by adding one new box (the last box in the bumping path). If this new
box is in the top row, then the resulting shape is certainly a partition diagram (namely,
$\mathrm{dg}((\mu_1 + 1, \mu_2, \ldots))$). Suppose the new box is in row $i > 1$. Then 10.54 shows that the new
box is located weakly left of a box in the previous row that belongs to $\mathrm{dg}(\mu)$. This implies
that $\mu_i < \mu_{i-1}$, so adding the new box to row $i$ will still give a partition diagram.

Next we prove that each time an entry of $T$ is bumped during the insertion of $x$, the
resulting tableau is still semistandard. Suppose, at some stage in the insertion process, that
an element $y$ bumps $z$ out of the following configuration:


        a
    b   z   c
        d


(Some of the boxes containing $a, b, c, d$ may be absent, in which case the following argument
should be modified appropriately.) The original configuration is part of a semistandard
tableau, so $b \leq z \leq c$ and $a < z < d$. Because $y$ bumps $z$, $z$ must be the first entry strictly
larger than $y$ in its row. This means that $b \leq y < z \leq c$, so replacing $z$ by $y$ will still
leave a weakly increasing row. Does the column containing $z$ still strictly increase after the
bumping? On one hand, $y < d$, since $y < z < d$. On the other hand, if the box containing $a$
exists (i.e., if $z$ is below the top row), then $y$ was the element bumped out of $a$'s row. Since
the bumping path moves weakly left, the original location of $y$ must have been weakly right
of $z$ in the row above $z$. If $y$ was directly above $z$, then $a$ must have bumped $y$, and so $a < y$
by definition of bumping. Otherwise, $y$ was located strictly to the right of $a$ before $y$ was
bumped, so $a \leq y$. We cannot have $a = y$ in this situation, since otherwise $a$ (or something
to its left) would have been bumped instead of $y$. Thus, $a < y$ in all cases.

Finally, consider what happens at the end of the insertion process, when an element $w$
is inserted in a new box at the end of a (possibly empty) row. This only happens when $w$
weakly exceeds all entries in its row, so the row containing $w$ is weakly increasing. There is
no cell below $w$ in this case. Repeating the argument at the end of the last paragraph, we
see that $w$ is strictly greater than the entry directly above it (if any). This completes the
proof that $T \leftarrow x$ is semistandard. □




10.10 Reverse Insertion 


Given the output $T \leftarrow x$ of a tableau insertion operation, it is generally not possible to
determine what $T$ and $x$ were. However, if we also know the location of the new box created
by this insertion, then we can recover $T$ and $x$. More generally, we can start with any
semistandard tableau $S$ and any “corner box” of $S$, and “uninsert” the value in this box to
obtain a semistandard tableau $T$ and value $x$ such that $S = T \leftarrow x$. (Here we do not assume
in advance that $S$ has the form $T \leftarrow x$.) This process is called reverse tableau insertion.
Before giving the general definition, we consider some examples.


10.56. Example. Consider the following semistandard tableau:


There are three corner boxes whose removal from $S$ will still leave a partition diagram; they
are the boxes at the end of the first, third, and sixth rows. Removing the corner box in the
top row, we evidently will have $S = T_1 \leftarrow 4$, where


Suppose instead that we remove the 6 at the end of the third row of $S$. Reversing the
bumping process, we see that 6 must have been bumped into the third row from the second
row. What element bumped it? In this case, it is the 5 in the second row. In turn, the 5
must have originally resided in the first row, before being bumped into the second row by
the 4. In summary, we have $S = T_2 \leftarrow 4$, where


(Here we have underlined the entries in the reverse bumping sequence, which occupy boxes
in the reverse bumping path.) Finally, consider what happens when we uninsert the 8 at
the end of the last row of $S$. The 8 was bumped to its current location by one of the 6's in
the previous row; it must have been bumped by the rightmost 6, lest semistandardness be
violated. Next, the 6 was bumped by the 5 in row 4; the 5 was bumped by the rightmost 4 in
row 3; and so on. In general, to determine which element in row $i$ bumped some value $z$ into
row $i+1$, we look for the rightmost entry in row $i$ that is strictly less than $z$. Continuing
in this way, we discover that $S = T_3 \leftarrow 2$, where


With these examples in hand, we are ready to give the general definition of reverse 
tableau insertion. 


10.57. Definition: Reverse Tableau Insertion. Suppose $S$ is a semistandard tableau
of shape $\nu$. A corner box of $\nu$ is a box $(i, j) \in \mathrm{dg}(\nu)$ such that $\mathrm{dg}(\nu) \setminus \{(i, j)\}$ is still the
diagram of some partition $\mu$. Given $S$ and a corner box $(i, j)$ of $\nu$, we define a tableau $T$
and a value $x$ by constructing a reverse bumping sequence $(x_i, x_{i-1}, \ldots, x_1)$
and a reverse bumping path $((i, j_i), (i-1, j_{i-1}), \ldots, (1, j_1))$ as follows.


1. Set $j_i = j$ and $x_i = S((i, j))$, which is the value of $S$ in the given corner box.


2. Once $x_k$ and $j_k$ have been found, for some $i \geq k > 1$, scan row $k-1$ of $S$ for
the rightmost entry that is strictly less than $x_k$. Define $x_{k-1}$ to be this entry,
and let $j_{k-1}$ be the column in which this entry occurs.


3. At the end, let $x = x_1$, and let $T$ be the tableau obtained by erasing box $(i, j_i)$
from $S$ and replacing the contents of box $(k-1, j_{k-1})$ by $x_k$, for $i \geq k > 1$.
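
Reverse insertion is just as short in code. In the Python sketch below (the name `reverse_insert` is ours), a tableau is a list of weakly increasing rows and the corner box is specified by its 0-based row index, since a corner is always the last box of its row. The caller is assumed to pass a genuine corner row of a semistandard tableau; the sketch does not validate this.

```python
from bisect import bisect_left


def reverse_insert(tableau, corner_row):
    """Uninsert the last entry of row `corner_row`; return (T, x) with S = T <- x."""
    rows = [list(row) for row in tableau]   # work on a copy
    x = rows[corner_row].pop()              # step 1: value in the corner box
    if not rows[corner_row]:
        rows.pop(corner_row)                # an emptied corner row is the last row
    for r in range(corner_row - 1, -1, -1):
        # Step 2: rightmost entry of row r that is strictly less than x.
        pos = bisect_left(rows[r], x) - 1
        # Step 3: x takes that entry's place, and the entry moves up next.
        rows[r][pos], x = x, rows[r][pos]
    return rows, x
```

For example, uninserting the corner in the second row of the tableau with rows `[1, 2, 2]` and `[3, 4]` returns the tableau with rows `[1, 2, 4]` and `[3]` together with the value 2.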


The next results will show that reverse insertion really is the two-sided inverse of ordinary 
insertion (given knowledge of the location of the new box). 


10.58. Theorem: Properties of Reverse Insertion. Suppose we perform reverse
tableau insertion on $S$ and $(i, j)$ to obtain $T$ and $x$ as in 10.57. (a) The reverse bumping
sequence satisfies $x_i > x_{i-1} > \cdots > x_1 = x$. (b) The reverse bumping path satisfies
$j_i \leq j_{i-1} \leq \cdots \leq j_1$. (c) $T$ is a semistandard tableau of shape $\mu$. (d) $(T \leftarrow x) = S$.


Proof. Part (a) follows from the definition of $x_{k-1}$ in 10.57. Note that there does exist an
entry in row $k-1$ strictly less than $x_k$, since the entry directly above $x_k$ (in cell $(k-1, j_k)$ of
$S$) is such an entry. This observation also shows that the rightmost entry strictly less than
$x_k$ in row $k-1$ occurs in column $j_k$ or later, proving (b). Part (c) follows from (a) and (b)
by an argument similar to that given in 10.55; we let the reader fill in the details. For part
(d), consider the bumping sequence $(x'_1, x'_2, \ldots)$ and bumping path $((1, j'_1), (2, j'_2), \ldots)$ for
the forward insertion $T \leftarrow x$. We have $x'_1 = x = x_1$ by definition. Recall that $x_1 = S((1, j_1))$
was the rightmost entry in row 1 of $S$ that was strictly less than $x_2$, and $T((1, j_1)) = x_2$ by
definition of $T$. All other entries in row 1 are the same in $S$ and $T$. So $T((1, j_1)) = x_2$ will
be the leftmost entry of row 1 of $T$ strictly larger than $x_1$. So, in the insertion $T \leftarrow x$, $x_1$
bumps $x_2$ out of cell $(1, j_1)$. In particular, $j'_1 = j_1$ and $x'_2 = x_2$. Repeating this argument in
each successive row, we see by induction that $x'_k = x_k$ and $j'_k = j_k$ for all $k$. At the end of
the insertion, we have recovered the starting tableau $S$. □


10.59. Theorem: Reversing Insertion. Suppose $S = (T \leftarrow x)$ for some semistandard
tableau $T$ and value $x$. Let $(i, j)$ be the new box created by this insertion. If we perform
reverse insertion on $S$ starting with box $(i, j)$, we will obtain the original $T$ and $x$.

Proof. This can be proved by induction, showing step by step that the forward and reverse
bumping paths and bumping sequences are the same. The argument is similar to part (d)
of 10.58, so we leave it as an exercise for the reader. □


The next theorem summarizes the results of the last two sections. 


10.60. Theorem: Invertibility of Tableau Insertion. Let $X$ be an ordered set, and let
$\mu$ be a fixed partition. Let $P(\mu)$ be the set of all partitions that can be obtained from $\mu$ by
adding a single box at the end of some row. There exist mutually inverse bijections

$$I : \mathrm{SSYT}_X(\mu) \times X \to \bigcup_{\nu \in P(\mu)} \mathrm{SSYT}_X(\nu), \qquad
R : \bigcup_{\nu \in P(\mu)} \mathrm{SSYT}_X(\nu) \to \mathrm{SSYT}_X(\mu) \times X$$

given by $I(T, x) = T \leftarrow x$ and $R(S) =$ the result of applying reverse tableau insertion to $S$
starting at the unique box of $S$ not in $\mu$.


Proof. We have seen that $I$ and $R$ are well-defined functions mapping into the stated
codomains. We see from 10.58(d) that $I \circ R$ is the identity map on $\bigcup_{\nu \in P(\mu)} \mathrm{SSYT}_X(\nu)$,
while 10.59 says that $R \circ I$ is the identity map on $\mathrm{SSYT}_X(\mu) \times X$. Hence $I$ and $R$ are
bijections. □

Let us take $X = \{1, 2, \ldots, N\}$ in 10.60. We can regard $X$ as a weighted set with
$\mathrm{wt}(i) = x_i$. The generating function for this weighted set is $x_1 + x_2 + \cdots + x_N =
h_1(x_1,\ldots,x_N) = s_{(1)}(x_1,\ldots,x_N) = e_1(x_1,\ldots,x_N)$. Note that the content monomial
$x^{c(T \leftarrow j)}$ is $x^{c(T)} x_j$, since $T \leftarrow j$ contains all the entries of $T$ together with one new entry
equal to $j$. This means that $\mathrm{wt}(I(T, j)) = \mathrm{wt}(T)\,\mathrm{wt}(j)$, so that the bijection $I$ in the theorem
is weight-preserving. Using the product rule for weighted sets and the definition of Schur
polynomials, the generating function for the domain of $I$ is $s_\mu(x_1,\ldots,x_N)\,h_1(x_1,\ldots,x_N)$.
Using the sum rule for weighted sets, the generating function for the codomain of $I$ is
$\sum_{\nu \in P(\mu)} s_\nu(x_1,\ldots,x_N)$. To summarize, our tableau insertion algorithms have furnished a
combinatorial proof of the following multiplication rule:

$$s_\mu h_1 = s_\mu e_1 = s_\mu s_{(1)} = \sum_{\nu \in P(\mu)} s_\nu,$$

where we sum over all partitions $\nu$ obtained by adding one corner box to $\mu$. We have
discovered the simplest instance of the Pieri rules mentioned at the beginning of §10.9.
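
Setting every $x_i = 1$ in this rule turns it into a counting statement that is easy to test: the number of pairs $(T, x)$ must equal the total number of semistandard tableaux of the shapes $\nu \in P(\mu)$, and the insertion map must hit each of them exactly once. The Python sketch below checks this by brute force; the function names and the tuple-of-rows representation are ours.

```python
from bisect import bisect_right
from collections import Counter
from itertools import product


def ssyt(shape, n):
    """Yield semistandard tableaux of a straight shape as tuples of row tuples."""
    for values in product(range(1, n + 1), repeat=sum(shape)):
        rows, start = [], 0
        for length in shape:
            rows.append(values[start:start + length])
            start += length
        rows_ok = all(row[c] <= row[c + 1] for row in rows for c in range(len(row) - 1))
        cols_ok = all(
            rows[r][c] < rows[r + 1][c]
            for r in range(len(rows) - 1)
            for c in range(len(rows[r + 1]))
        )
        if rows_ok and cols_ok:
            yield tuple(rows)


def insert(tableau, x):
    """Return T <- x as a tuple of row tuples."""
    rows = [list(row) for row in tableau]
    for row in rows:
        pos = bisect_right(row, x)
        if pos == len(row):
            row.append(x)
            break
        row[pos], x = x, row[pos]
    else:
        rows.append([x])   # x was bumped past the last row (or T was empty)
    return tuple(tuple(row) for row in rows)


def insertion_images(mu, n):
    """List T <- x over all semistandard T of shape mu and all x in 1..n."""
    return [insert(t, x) for t in ssyt(mu, n) for x in range(1, n + 1)]


images = insertion_images((2, 1), 3)
shape_counts = Counter(tuple(len(row) for row in s) for s in images)
```

With $\mu = (2,1)$ and $N = 3$ there are $8 \cdot 3 = 24$ pairs, and the images are 24 distinct tableaux spread over the shapes $(3,1)$, $(2,2)$, and $(2,1,1)$.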


10.11 Bumping Comparison Theorem 


We now extend the analysis of the previous section to prove the general Pieri rules for
expanding $s_\mu h_k$ and $s_\mu e_k$ in terms of Schur polynomials. The key idea is to see what
happens when we successively insert $k$ weakly increasing numbers (or $k$ strictly decreasing
numbers) into a semistandard tableau by repeated tableau insertion. We begin with some
examples to build intuition.


10.61. Example. Consider the semistandard tableau


[Tableau $T$: display not recoverable from the source.]

Let us compute the tableaux that result by successively inserting 2, 3, 3, 5 into $T$:


[Tableaux $T_1, T_2, T_3, T_4$: display not recoverable from the source.]


Consider the skew shape consisting of the four cells in $T_4$ that are not in $T$, which are
marked by asterisks in the following picture:


Observe that this skew shape is a horizontal strip of size 4. Next, compare the bumping
paths in the successive insertions of 2, 3, 3, 5. We see that each path lies strictly right of the
previous bumping path and ends with a new box in a weakly higher row.

Now return to the original tableau $T$, and consider the insertion of a strictly decreasing
sequence 5, 4, 2, 1. We obtain the following tableaux:


[Tableaux $S_1, S_2, S_3, S_4$: display not recoverable from the source.]


This time, each successive bumping path is weakly left of the previous one and ends in a
strictly lower row. Accordingly, the new boxes in $S_4$ form a vertical strip:


We now show that the observations in this example hold in general. 
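
Before turning to the proof, the two patterns seen in the example can be tested exhaustively on small tableaux. The Python sketch below inserts $x$ and then $y$, records the two new boxes as 0-based `(row, column)` pairs, and checks the expected relation: when $x \leq y$ the second new box is weakly higher and strictly to the right, and when $x > y$ it is strictly lower and weakly to the left. All function names are ours.

```python
from bisect import bisect_right
from itertools import product


def ssyt(shape, n):
    """Yield semistandard tableaux of a straight shape as lists of row lists."""
    for values in product(range(1, n + 1), repeat=sum(shape)):
        rows, start = [], 0
        for length in shape:
            rows.append(list(values[start:start + length]))
            start += length
        rows_ok = all(row[c] <= row[c + 1] for row in rows for c in range(len(row) - 1))
        cols_ok = all(
            rows[r][c] < rows[r + 1][c]
            for r in range(len(rows) - 1)
            for c in range(len(rows[r + 1]))
        )
        if rows_ok and cols_ok:
            yield rows


def insert(tableau, x):
    """Return (T <- x, new_box), leaving the input tableau unchanged."""
    rows = [list(row) for row in tableau]
    for r, row in enumerate(rows):
        pos = bisect_right(row, x)
        if pos == len(row):
            row.append(x)
            return rows, (r, pos)
        row[pos], x = x, row[pos]
    rows.append([x])
    return rows, (len(rows) - 1, 0)


def comparison_holds(tableau, x, y):
    """Check the expected relation between the new boxes of T <- x and (T <- x) <- y."""
    after_x, (i, j) = insert(tableau, x)
    _, (r, s) = insert(after_x, y)
    if x <= y:
        return i >= r and j < s
    return i < r and j >= s


def check_shape(shape, n):
    """Test every tableau of the given shape and every pair (x, y) in 1..n."""
    letters = range(1, n + 1)
    return all(
        comparison_holds(t, x, y)
        for t in ssyt(shape, n) for x in letters for y in letters
    )
```

Running `check_shape` on a few small shapes is a quick way to catch an off-by-one error in an insertion routine, since a wrong bumping rule usually violates one of the two inequalities.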


10.62. Bumping Comparison Theorem. Let T be a semistandard tableau using letters 
in X, and let x,y € X. Let the new box in T < a be (i, 7), and let the new box in 
(T—2x)< y be (r,s). (a)u<yiffi>randj <s; (b)a>yiffi<randj>s. 
Proof. It suffices to prove the forward implications, since exactly one of $x \le y$ or $x > y$
is true. Let the bumping path for the insertion of $x$ be $((1, j_1), (2, j_2), \ldots, (i, j_i))$ (where
$j_i = j$), and let the bumping sequence be $(x = x_1, x_2, \ldots, x_i)$. Let the bumping path for the
insertion of $y$ be $((1, s_1), (2, s_2), \ldots, (r, s_r))$ (where $s_r = s$), and let the bumping sequence
be $(y = y_1, y_2, \ldots, y_r)$.

Assume $x \le y$. We prove the following statement by induction: for all $k$ with $1 \le k \le r$,
we have $i \ge k$ and $x_k \le y_k$ and $j_k < s_k$. When $k = 1$, we have $i \ge 1$ and $x_1 \le y_1$ (by
assumption). Note that $x_1$ appears in box $(1, j_1)$ of $T \leftarrow x$. We cannot have $s_1 \le j_1$, for this
would mean that $y_1$ bumps an entry weakly left of $(1, j_1)$, and this entry is at most $x_1 \le y_1$,
contrary to the definition of bumping. So $j_1 < s_1$. Now consider the induction step. Suppose
$k < r$ and the induction hypothesis holds for $k$; does it hold for $k+1$? Since $k < r$, $y_k$ must
have bumped something from position $(k, s_k)$ into the next row. Since $j_k < s_k$, $x_k$ must also
have bumped something out of row $k$, proving that $i \ge k+1$. The object bumped by $x_k$,
namely $x_{k+1}$, appears to the left of the object bumped by $y_k$, namely $y_{k+1}$, in the same row
of a semistandard tableau. Therefore, $x_{k+1} \le y_{k+1}$. Now we can repeat the argument used
for the first row to see that $j_{k+1} < s_{k+1}$. Now that the induction is complete, take $k = r$
to see that $i \ge r$ and $j = j_i \le j_r < s_r = s$ (the first inequality holding since the bumping
path for $T \leftarrow x$ moves weakly left as we go down).

Next, assume $x > y$. This time we prove the following by induction: for all $k$ with
$1 \le k \le i$, we have $r > k$ and $x_k > y_k$ and $j_k \ge s_k$. When $k = 1$, we have $x_1 = x > y = y_1$.
Since $x$ appears somewhere in the first row of $T \leftarrow x$, $y$ will necessarily bump something
into the second row, so $r > 1$. In fact, the thing bumped by $y$ occurs weakly left of the
position $(1, j_1)$ occupied by $x$, so $s_1 \le j_1$. For the induction step, assume the induction
hypothesis is known for some $k < i$, and try to prove it for $k+1$. Since $k < i$ and $k < r$,
both $x_k$ and $y_k$ must bump elements out of row $k$ into row $k+1$. The element $y_{k+1}$ bumped
by $y_k$ occurs in column $s_k$, which is weakly left of the cell $(k, j_k)$ occupied by $x_k$ in $T \leftarrow x$.
Therefore, $y_{k+1} \le x_k$, which is in turn strictly less than $x_{k+1}$, the original occupant of cell
$(k, j_k)$ in $T$. So $x_{k+1} > y_{k+1}$. Repeating the argument used in the first row for row $k+1$,
we now see that $y_{k+1}$ must bump something in row $k+1$ into row $k+2$ (so that $r > k+1$),
and $s_{k+1} \le j_{k+1}$. This completes the induction. Taking $k = i$, we finally conclude that $r > i$
and $j = j_i \ge s_i \ge s_r = s$. □
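The theorem can be checked experimentally on small cases. The following is a minimal sketch, assuming that tableau insertion means the usual row insertion in which $x$ bumps the leftmost entry strictly larger than $x$; the function names and the sample tableau are our own illustrative choices, not taken from the text.

```python
def row_insert(tableau, x):
    """Insert x into a semistandard tableau given as a list of rows.

    Returns the new tableau and the 1-based (row, column) of the new box.
    """
    rows = [list(r) for r in tableau]
    for i, row in enumerate(rows):
        for j, entry in enumerate(row):
            if entry > x:             # leftmost entry strictly larger than x
                row[j], x = x, entry  # x bumps it; the old entry moves down a row
                break
        else:                         # nothing bumped: x goes at the end of this row
            row.append(x)
            return rows, (i + 1, len(row))
    rows.append([x])                  # fell off the bottom: start a new row
    return rows, (len(rows), 1)


def check_bumping_comparison(tableau, letters):
    """Check parts (a) and (b) of 10.62 for every pair (x, y) of letters."""
    for x in letters:
        t1, (i, j) = row_insert(tableau, x)
        for y in letters:
            _, (r, s) = row_insert(t1, y)
            if x <= y and not (i >= r and j < s):
                return False
            if x > y and not (i < r and j >= s):
                return False
    return True


# Hypothetical sample tableau (not the one from Example 10.61).
T = [[1, 1, 2, 3, 4], [2, 3, 4], [3, 4]]
print(check_bumping_comparison(T, range(1, 7)))
```

Such a check is no substitute for the proof, but it is a convenient way to catch a misremembered inequality.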




10.12 Pieri Rules 


Iteration of the bumping comparison theorem proves the following result. 


10.63. Theorem: Inserting a Monotone Sequence into a Tableau. Let $T$ be a
semistandard tableau of shape $\mu$, and let $S$ be the semistandard tableau obtained from $T$
by insertion of $z_1, z_2, \ldots, z_k$ (in this order); we write $S = (T \leftarrow z_1 z_2 \cdots z_k)$ in this situation.
Let $\nu$ be the shape of $S$.

(a) If $z_1 \le z_2 \le \cdots \le z_k$, then $\nu/\mu$ is a horizontal strip of size $k$.

(b) If $z_1 > z_2 > \cdots > z_k$, then $\nu/\mu$ is a vertical strip of size $k$.


Since tableau insertion is reversible given the location of the new box, we can also reverse 
the insertion of a monotone sequence, in the following sense. 


10.64. Theorem: Reverse Insertion of a Monotone Sequence. Suppose $\mu$ and $\nu$ are
given partitions, and $S$ is any semistandard tableau of shape $\nu$.
(a) If $\nu/\mu$ is a horizontal strip of size $k$, then there exists a unique sequence $z_1 \le z_2 \le \cdots \le z_k$
and a unique semistandard tableau $T$ of shape $\mu$ such that $S = (T \leftarrow z_1 z_2 \cdots z_k)$.
(b) If $\nu/\mu$ is a vertical strip of size $k$, then there exists a unique sequence $z_1 > z_2 > \cdots > z_k$
and a unique semistandard tableau $T$ of shape $\mu$ such that $S = (T \leftarrow z_1 z_2 \cdots z_k)$.


Proof. To prove the existence of $T$ and the $z_i$'s in part (a), we repeatedly perform reverse
tableau insertion, erasing each cell in the horizontal strip $\nu/\mu$ from right to left. This
produces a sequence of elements $z_k, \ldots, z_2, z_1$ and a semistandard tableau $T$ of shape $\mu$ such
that $(T \leftarrow z_1 z_2 \cdots z_k) = S$. Keeping in mind the relative locations of the new boxes created
by $z_i$ and $z_{i+1}$, we see from the bumping comparison theorem that $z_i \le z_{i+1}$ for all $i$.

As for uniqueness, suppose $T'$ and $z'_1 \le z'_2 \le \cdots \le z'_k$ also satisfy $S = (T' \leftarrow z'_1 z'_2 \cdots z'_k)$.
Since $z'_1 \le z'_2 \le \cdots \le z'_k$, the bumping comparison theorem shows that the insertion of the
$z'_i$'s creates the new boxes of $\nu/\mu$ in order from left to right, just as the insertion of the $z_i$'s
does. Write $T_0 = T$, $T_i = (T \leftarrow z_1 z_2 \cdots z_i)$, $T'_0 = T'$, and $T'_i = (T' \leftarrow z'_1 z'_2 \cdots z'_i)$. Since
reverse tableau insertion produces a unique answer given the location of the new box, one
now sees by reverse induction on $i$ that $T_i = T'_i$ and $z_i = z'_i$ for $k \ge i \ge 0$.

Part (b) is proved similarly. □


10.65. Theorem: Pieri Rules. Given an integer partition $\mu$ and positive integer $k$, let
$H_k(\mu)$ consist of all partitions $\nu$ such that $\nu/\mu$ is a horizontal strip of size $k$, and let $V_k(\mu)$
consist of all partitions $\nu$ such that $\nu/\mu$ is a vertical strip of size $k$. For every ordered set
$X$, there are weight-preserving bijections

$$F : \mathrm{SSYT}_X(\mu) \times \mathrm{SSYT}_X((k)) \to \bigcup_{\nu \in H_k(\mu)} \mathrm{SSYT}_X(\nu);$$

$$G : \mathrm{SSYT}_X(\mu) \times \mathrm{SSYT}_X((1^k)) \to \bigcup_{\nu \in V_k(\mu)} \mathrm{SSYT}_X(\nu).$$

Consequently, we have the Pieri rules in $\Lambda_N$:

$$s_\mu h_k = \sum_{\nu \in H_k(\mu)} s_\nu; \qquad s_\mu e_k = \sum_{\nu \in V_k(\mu)} s_\nu.$$


Proof. Recall that a semistandard tableau of shape $(k)$ can be identified with a weakly
increasing sequence $z_1 \le z_2 \le \cdots \le z_k$ of elements of $X$. So, we can define $F(T, z_1 z_2 \cdots z_k) =
(T \leftarrow z_1 z_2 \cdots z_k)$. By 10.63, $F$ does map into the stated codomain. Then 10.64 shows
that $F$ is a bijection. Moreover, $F$ is weight-preserving, since the content monomial of
$(T \leftarrow z_1 z_2 \cdots z_k)$ is $x^{c(T)} x_{z_1} x_{z_2} \cdots x_{z_k}$.

Similarly, a semistandard tableau of shape $(1^k)$ can be identified with a strictly increasing
sequence $y_1 < y_2 < \cdots < y_k$. Reversing this gives a strictly decreasing sequence. So we
define $G(T, y_1 y_2 \cdots y_k) = (T \leftarrow y_k \cdots y_2 y_1)$. As above, 10.63 and 10.64 show that $G$ is a
well-defined bijection.

Finally, the Pieri rules follow by passing from weighted sets to generating functions,
keeping in mind the sum and product rules for weighted sets, and using $h_k = s_{(k)}$ and
$e_k = s_{(1^k)}$. □


10.66. Example. We have
$$s_{(4,3,1)} h_2 = s_{(6,3,1)} + s_{(5,4,1)} + s_{(5,3,2)} + s_{(5,3,1,1)} + s_{(4,4,2)} + s_{(4,4,1,1)} + s_{(4,3,3)} + s_{(4,3,2,1)},$$
as we see by drawing the following pictures:

[Figure: the eight shapes obtained by adding a horizontal strip of size 2 to dg((4,3,1)).]

Similarly, we find that

$$s_{(2,2)} e_3 = s_{(2,2,1,1,1)} + s_{(3,2,1,1)} + s_{(3,3,1)}$$

by adding vertical strips to dg((2,2)) as shown here:

[Figure: the three shapes obtained by adding a vertical strip of size 3 to dg((2,2)).]
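The shapes appearing in the Pieri rules can also be listed mechanically. Below is a minimal sketch, assuming partitions are stored as tuples of positive integers and using the interlacing description of horizontal strips ($\nu_1 \ge \mu_1 \ge \nu_2 \ge \mu_2 \ge \cdots$); the function names are hypothetical. Vertical strips are obtained by conjugating.

```python
def horizontal_strips(mu, k):
    """All partitions nu such that nu/mu is a horizontal strip of size k."""
    mu = list(mu) + [0]               # allow one new row at the bottom
    results = []

    def extend(row, prefix, remaining):
        if row == len(mu):
            if remaining == 0:
                results.append(tuple(p for p in prefix if p > 0))
            return
        # A row of nu may not be longer than the row above it in mu.
        upper = mu[row - 1] if row > 0 else mu[0] + remaining
        for part in range(mu[row], min(upper, mu[row] + remaining) + 1):
            extend(row + 1, prefix + [part], remaining - (part - mu[row]))

    extend(0, [], k)
    return results


def conjugate(lam):
    """Conjugate (transpose) of a partition."""
    if not lam:
        return ()
    return tuple(sum(1 for part in lam if part > c) for c in range(lam[0]))


def vertical_strips(mu, k):
    """All partitions nu such that nu/mu is a vertical strip of size k."""
    return [conjugate(nu) for nu in horizontal_strips(conjugate(mu), k)]


print(sorted(horizontal_strips((4, 3, 1), 2), reverse=True))
print(sorted(vertical_strips((2, 2), 3), reverse=True))
```

The two printed lists give the index shapes in the two expansions of Example 10.66 (ignoring any bound on the number of parts, i.e., assuming $N$ is large).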


10.13 Schur Expansion of $h_\alpha$


Iteration of the Pieri rules lets us compute the Schur expansions of products of the form
$s_\mu h_{\alpha_1} h_{\alpha_2} \cdots h_{\alpha_s}$, or $s_\mu e_{\alpha_1} e_{\alpha_2} \cdots e_{\alpha_s}$, or even “mixed products” involving both $h$'s and $e$'s.
Taking $\mu = 0$, so that $s_\mu = 1$, we obtain in particular the expansions of $h_\alpha$ and $e_\alpha$ into sums
of Schur polynomials. As we will see, examination of these expansions will lead to another
occurrence of the Kostka matrix (cf. 10.50).


10.67. Example. Let us use the Pieri rule to find the Schur expansion of $h_{(2,1,3)} = h_2 h_1 h_3$.
To start, recall that $h_2 = s_{(2)}$. Adding one box to dg((2)) in all possible ways gives

$$h_2 h_1 = s_{(3)} + s_{(2,1)}.$$

Now we add a horizontal strip of size 3 in all possible ways to get

$$h_2 h_1 h_3 = s_{(3)} h_3 + s_{(2,1)} h_3$$
$$= [s_{(6)} + s_{(5,1)} + s_{(4,2)} + s_{(3,3)}] + [s_{(5,1)} + s_{(4,2)} + s_{(4,1,1)} + s_{(3,2,1)}]$$
$$= s_{(6)} + 2s_{(5,1)} + 2s_{(4,2)} + s_{(4,1,1)} + s_{(3,3)} + s_{(3,2,1)}.$$

Observe that the Schur polynomials $s_{(5,1)}$ and $s_{(4,2)}$ each occurred twice in the final
expansion. Now, consider the computation of $h_{(2,3,1)} = h_2 h_3 h_1$. Since multiplication of
polynomials is commutative, this symmetric polynomial must be the same as $h_{(2,1,3)}$. But the
computations with the Pieri rule involve different intermediate objects. We initially calculate

$$h_2 h_3 = s_{(2)} h_3 = s_{(5)} + s_{(4,1)} + s_{(3,2)}.$$

Continuing by multiplying by $h_1$ gives

$$h_2 h_3 h_1 = s_{(5)} h_1 + s_{(4,1)} h_1 + s_{(3,2)} h_1$$
$$= [s_{(6)} + s_{(5,1)}] + [s_{(5,1)} + s_{(4,2)} + s_{(4,1,1)}] + [s_{(4,2)} + s_{(3,3)} + s_{(3,2,1)}],$$

which is the same as the previous answer after collecting terms. As an exercise, the reader
is invited to compute $h_{(3,2,1)} = h_3 h_2 h_1$ and verify that the final answer is again the same.
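The iteration just carried out by hand is easy to automate. The following sketch (with hypothetical function names, and assuming enough variables that no shape is discarded for having too many rows) starts from $s_{()} = 1$ and applies the horizontal-strip Pieri rule once for each part of $\alpha$.

```python
from collections import Counter


def horizontal_strips(mu, k):
    """Partitions nu with nu/mu a horizontal strip of size k (interlacing condition)."""
    mu = list(mu) + [0]
    out = []

    def extend(row, prefix, remaining):
        if row == len(mu):
            if remaining == 0:
                out.append(tuple(p for p in prefix if p > 0))
            return
        upper = mu[row - 1] if row > 0 else mu[0] + remaining
        for part in range(mu[row], min(upper, mu[row] + remaining) + 1):
            extend(row + 1, prefix + [part], remaining - (part - mu[row]))

    extend(0, [], k)
    return out


def schur_expansion_of_h(alpha):
    """Coefficients of s_lambda in h_alpha, found by iterating the Pieri rule."""
    expansion = Counter({(): 1})      # start from s_() = 1
    for part in alpha:
        step = Counter()
        for mu, coeff in expansion.items():
            for nu in horizontal_strips(mu, part):
                step[nu] += coeff
        expansion = step
    return dict(expansion)


print(schur_expansion_of_h((2, 1, 3)))
print(schur_expansion_of_h((2, 1, 3)) == schur_expansion_of_h((3, 2, 1)))
```

Running this on the three orderings of (2, 1, 3) is one way to carry out the exercise suggested in the example.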
10.68. Example. We have seen that a given Schur polynomial may appear several times
in the Schur expansion of $h_\alpha$. Is there some way to find the coefficient of a particular Schur
polynomial in this expansion, without writing down all the shapes generated by iteration
of the Pieri rule? To answer this question, consider the problem of finding the coefficient
of $s_{(5,4,3)}$ when $h_{(4,3,3,2)}$ is expanded into a sum of Schur polynomials. Consider the shapes
that appear when we repeatedly use the Pieri rule on the product $h_4 h_3 h_3 h_2$. Initially, we
have a single shape (4) corresponding to $h_4$. Next, we add a horizontal strip of size 3 in all
possible ways. Then we add another horizontal strip of size 3 in all possible ways. Finally,
we add a horizontal strip of size 2 in all possible ways. The coefficient we seek is the number
of ways that the shape (5,4,3) can be built by making the ordered sequence of choices just
described. For example, here is one choice sequence that leads to the shape (5,4,3):

[Figure: choice sequences building (5,4,3) from (4) by horizontal strips of sizes 3, 3, 2.]

Here is a second choice sequence that leads to the same shape:

[Figure: another such choice sequence.]

Now comes the key observation. We have exhibited each choice sequence by drawing a
succession of shapes showing the sequential addition of each horizontal strip. The same
information can be encoded by drawing one copy of the final shape (5,4,3) and putting
a label in each box to show which horizontal strip caused that box to first appear in the
shape. For example, the three choice sequences displayed above are encoded (in order) by
the following three objects:

[Figure: three labeled fillings of the shape (5,4,3).]

We have just drawn three semistandard tableaux of shape (5,4,3) and content (4,3,3,2)!
By definition of the encoding just described, we see that every choice sequence under
consideration will be encoded by some tableau of content (4,3,3,2). Since we build the tableau
by adding horizontal strips one at a time using increasing labels, it follows that the tableau
we get will always be semistandard. Finally, we can go backwards in the sense that any
semistandard tableau of content (4,3,3,2) can be built uniquely by choosing a succession
of horizontal strips that tells us where the 1's, 2's, 3's and 4's appear in the tableau. To
summarize these remarks, our encoding scheme proves that the coefficient of $s_{(5,4,3)}$ in the
Schur expansion of $h_{(4,3,3,2)}$ is the number of semistandard tableaux of shape (5,4,3) and
content (4,3,3,2). In addition to the three semistandard tableaux already drawn, we have
the following tableaux of this shape and content:

[Figure: the remaining three semistandard tableaux of shape (5,4,3) and content (4,3,3,2).]

So the desired coefficient in this example is six.
The argument in the last example generalizes to prove the following result.


10.69. Theorem: Schur Expansion of $h_\alpha$. Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_s)$ be any sequence of
nonnegative integers with sum $k$. Then

$$h_\alpha(x_1, \ldots, x_N) = \sum_{\lambda \in \mathrm{Par}_N(k)} K_{\lambda,\alpha}\, s_\lambda(x_1, \ldots, x_N).$$

(It is also permissible to sum over Par($k$) here.)


Proof. By the Pieri rule, the coefficient of $s_\lambda$ in $h_\alpha$ is the number of sequences of partitions

$$0 = \mu^0 \subseteq \mu^1 \subseteq \mu^2 \subseteq \cdots \subseteq \mu^s = \lambda \qquad (10.6)$$

such that the skew shape $\mu^i/\mu^{i-1}$ is a horizontal strip of size $\alpha_i$, for $1 \le i \le s$. (This is a
formal way of describing which horizontal strips we choose at each application of the Pieri
rule to the product $h_\alpha$.) On the other hand, $K_{\lambda,\alpha}$ is the number of semistandard tableaux
of shape $\lambda$ and content $\alpha$. There is a bijection between the sequences (10.6) and these
tableaux, defined by filling each strip $\mu^i/\mu^{i-1}$ with $\alpha_i$ copies of the letter $i$. The resulting
tableau has content $\alpha$ and is semistandard. The inverse map sends a semistandard tableau
$T$ to the sequence $(\mu^i : 0 \le i \le s)$, where $\mu^i$ consists of the cells of $T$ containing symbols in
$\{1, 2, \ldots, i\}$. □
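The bijection in this proof can be written out concretely. Here is a minimal sketch, assuming a tableau is stored as a list of rows and a chain as a list of partitions (tuples); the function names and the sample tableau are hypothetical choices for illustration.

```python
def tableau_to_chain(tableau, s):
    """Send a semistandard tableau to the chain (mu^0, ..., mu^s):
    mu^i is the shape formed by the cells whose entries lie in {1, ..., i}."""
    chain = []
    for i in range(s + 1):
        shape = tuple(sum(1 for entry in row if entry <= i) for row in tableau)
        chain.append(tuple(p for p in shape if p > 0))
    return chain


def chain_to_tableau(chain):
    """Inverse map: fill each strip mu^i / mu^(i-1) with copies of the letter i."""
    tableau = [[0] * part for part in chain[-1]]
    for i in range(1, len(chain)):
        prev, cur = chain[i - 1], chain[i]
        for r, part in enumerate(cur):
            start = prev[r] if r < len(prev) else 0
            for c in range(start, part):   # cells of row r lying in the new strip
                tableau[r][c] = i
    return tableau


# A sample tableau of shape (5,4,3) and content (4,3,3,2), chosen for illustration.
T = [[1, 1, 1, 1, 2], [2, 2, 3, 3], [3, 4, 4]]
chain = tableau_to_chain(T, 4)
print(chain)
print(chain_to_tableau(chain) == T)
```

The successive differences of the shapes in the printed chain have sizes 4, 3, 3, 2, as required by the content.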


10.70. Remark. Suppose $\alpha, \beta$ are sequences such that sort($\alpha$) = sort($\beta$). Note that $h_\alpha =
h_\beta$ since multiplication of polynomials is commutative. Expanding each side into Schur
polynomials gives

$$\sum_{\lambda \vdash k} K_{\lambda,\alpha}\, s_\lambda(x_1, \ldots, x_N) = \sum_{\lambda \vdash k} K_{\lambda,\beta}\, s_\lambda(x_1, \ldots, x_N).$$

For $N \ge k$, the Schur polynomials appearing here will be linearly independent by 10.49. So
$K_{\lambda,\alpha} = K_{\lambda,\beta}$ for all $\lambda$, in confirmation of 10.33. (This remark leads to an algebraic proof
of 10.33, provided one first gives an algebraic proof of the linear independence of Schur
polynomials.)


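10.71. Remark. The previous theorem and remark extend to skew shapes as follows. First,

$$s_\mu h_\alpha = \sum_{\lambda \in \mathrm{Par}_N} K_{\lambda/\mu,\alpha}\, s_\lambda.$$

One need only change $\mu^0$ from 0 to $\mu$ in the proof above. Second, if sort($\alpha$) = sort($\beta$), then
$K_{\lambda/\mu,\alpha} = K_{\lambda/\mu,\beta}$.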


10.72. Theorem: Complete Homogeneous Basis of $\Lambda_N^k$. For all $k, N \in \mathbb{N}$, the set of
complete homogeneous polynomials

$$\{h_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)\} \subseteq K[x_1, \ldots, x_N]$$

is a basis of the $K$-vector space $\Lambda_N^k$.


Proof. Consider column vectors $S = (s_\lambda(x_1, \ldots, x_N) : \lambda \in \mathrm{Par}_N(k))$ and $H =
(h_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k))$, where the entries are listed in lexicographic order. As in
the proof of 10.49, let $K = (K_{\lambda,\mu})$ be the Kostka matrix with rows and columns indexed by
partitions in $\mathrm{Par}_N(k)$ in lexicographic order. Recall from 10.47 that $K$ is a lower-triangular
matrix with 1's on the main diagonal. In matrix form, 10.69 asserts that $H = K^t S$, where
$K^t$ is the transpose of the Kostka matrix. This transpose is upper-triangular with 1's on the
main diagonal, hence is invertible. Since $H$ is obtained from $S$ by application of an invertible
matrix of scalars, we see that the elements of $H$ form a basis by the same reasoning used
in the proof of 10.49 (cf. 10.178). □
10.73. Remark. Combining 10.72 with 10.49, we can write $H = (K^t K)M$, where $M$ is
the vector of monomial symmetric polynomials indexed by $\mathrm{Par}_N(k)$. This matrix equation
gives the monomial expansion of the complete homogeneous symmetric polynomials $h_\mu$.




10.14 Schur Expansion of $e_\alpha$


Now we turn to the elementary symmetric polynomials $e_\alpha$. We can iterate the Pieri rule as
we did for $h_\alpha$, but here we must add vertical strips at each stage.


10.74. Example. Let us compute the Schur expansion of $e_{(2,2,2)} = e_2 e_2 e_2$. First, $e_2 e_2 =
s_{(1,1)} e_2 = s_{(2,2)} + s_{(2,1,1)} + s_{(1,1,1,1)}$. Next,

$$e_2 e_2 e_2 = [s_{(3,3)} + s_{(3,2,1)} + s_{(2,2,1,1)}]$$
$$+ [s_{(3,2,1)} + s_{(3,1,1,1)} + s_{(2,2,2)} + s_{(2,2,1,1)} + s_{(2,1,1,1,1)}]$$
$$+ [s_{(2,2,1,1)} + s_{(2,1,1,1,1)} + s_{(1,1,1,1,1,1)}]$$
$$= s_{(3,3)} + 2s_{(3,2,1)} + s_{(3,1,1,1)} + s_{(2,2,2)} + 3s_{(2,2,1,1)} + 2s_{(2,1^4)} + s_{(1^6)}.$$

As in the case of $h_\alpha$, we can use tableaux to encode the sequence of vertical strips chosen
in the repeated application of the Pieri rules. For example, the following tableaux encode
the three choice sequences that lead to the shape (2,2,1,1) in the expansion of $e_{(2,2,2)}$:

    1 3      1 2      1 2
    1 3      1 3      1 2
    2        2        3
    2        3        3

Evidently, these tableaux are not semistandard (column-strict). However, transposing the
diagrams will produce semistandard tableaux of shape $(2,2,1,1)' = (4,2)$ and content
(2,2,2), as shown here:

    1 1 2 2      1 1 2 3      1 1 3 3
    3 3          2 3          2 2

This encoding gives a bijection from the relevant choice sequences to the collection of
semistandard tableaux of this shape and content. So the coefficient of $s_{(2,2,1,1)}$ in the Schur
expansion of $e_{(2,2,2)}$ is the Kostka number $K_{(4,2),(2,2,2)} = 3$. This argument generalizes to
prove the following theorem.
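As with $h_\alpha$, the computation in this example can be automated. The sketch below uses hypothetical function names and assumes enough variables that no shape is discarded. It uses the description of a vertical strip as a skew shape in which every row gains at most one cell, and iterates the vertical-strip Pieri rule.

```python
from collections import Counter


def vertical_strips(mu, k):
    """Partitions nu with nu/mu a vertical strip of size k: each row grows by 0 or 1."""
    mu = list(mu) + [0] * k           # up to k new rows of length 1
    out = []

    def extend(row, prefix, remaining):
        if row == len(mu):
            if remaining == 0:
                out.append(tuple(p for p in prefix if p > 0))
            return
        for add in (0, 1):
            part = mu[row] + add
            # Stay a partition: no row may exceed the row above it.
            if add <= remaining and (row == 0 or part <= prefix[-1]):
                extend(row + 1, prefix + [part], remaining - add)

    extend(0, [], k)
    return out


def schur_expansion_of_e(alpha):
    """Coefficients of s_lambda in e_alpha, found by iterating the Pieri rule."""
    expansion = Counter({(): 1})      # start from s_() = 1
    for part in alpha:
        step = Counter()
        for mu, coeff in expansion.items():
            for nu in vertical_strips(mu, part):
                step[nu] += coeff
        expansion = step
    return dict(expansion)


print(schur_expansion_of_e((2, 2, 2)))
```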


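10.75. Theorem: Schur Expansion of $e_\alpha$. Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_s)$ be any sequence of
nonnegative integers with sum $k$. Then

$$e_\alpha(x_1, \ldots, x_N) = \sum_{\lambda \in \mathrm{Par}_N(k)} K_{\lambda',\alpha}\, s_\lambda(x_1, \ldots, x_N) = \sum_{\nu \in \mathrm{Par}_N(k)'} K_{\nu,\alpha}\, s_{\nu'}(x_1, \ldots, x_N).$$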


10.76. Remark. We have written $\mathrm{Par}_N(k)'$ for the set $\{\lambda' : \lambda \in \mathrm{Par}_N(k)\}$. Since
conjugation of a partition interchanges the number of parts with the length of the largest part, we
have

$$\mathrm{Par}_N(k)' = \{\nu \in \mathrm{Par}(k) : \nu_1 \le N\} = \{\nu \in \mathrm{Par}(k) : \nu_i \le N \text{ for all } i \ge 1\}.$$

It is also permissible to sum over all partitions of $k$ in the theorem, since this will only add
zero terms to the sum. If the number of variables is large enough ($N \ge k$), then we will
already be summing over all partitions of $k$.
10.77. Theorem: Elementary Basis of $\Lambda_N^k$. For all $k, N \in \mathbb{N}$, the set of elementary
symmetric polynomials

$$\{e_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)'\} = \{e_{\mu'}(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)\} \subseteq K[x_1, \ldots, x_N]$$

is a basis of the $K$-vector space $\Lambda_N^k$. Consequently, the set of all polynomials $e_1^{i_1} \cdots e_N^{i_N}$,
where the $i_j$ are arbitrary nonnegative integers, is a basis of $\Lambda_N$.


Proof. We use the same matrix argument employed earlier, suitably adjusted to account
for the shape conjugation. As in the past, let us index the rows and columns of matrices
and vectors by the partitions in $\mathrm{Par}_N(k)$, listed in lexicographic order. Introduce column
vectors $S = (s_\lambda(x_1, \ldots, x_N) : \lambda \in \mathrm{Par}_N(k))$ and $E = (e_{\mu'}(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k))$.
Next, consider the modified Kostka matrix $\hat{K}$ whose entry in row $\mu$ and column $\lambda$ is $K_{\lambda',\mu'}$.
Now 10.75 asserts that

$$e_{\mu'}(x_1, \ldots, x_N) = \sum_{\lambda \in \mathrm{Par}_N(k)} K_{\lambda',\mu'}\, s_\lambda(x_1, \ldots, x_N).$$

By definition of matrix-vector multiplication, the equations just written are equivalent to
$E = \hat{K} S$. Since the entries of $S$ are known to be a basis, it suffices (as in the proofs of 10.49
and 10.72) to argue that $\hat{K}$ is a triangular matrix with 1's on the diagonal. There are 1's
on the diagonal, since $K_{\mu',\mu'} = 1$. On the other hand, by 10.45, we have the implications

$$\hat{K}(\mu, \lambda) \ne 0 \Rightarrow K_{\lambda',\mu'} \ne 0 \Rightarrow \mu' \trianglelefteq \lambda' \Rightarrow \lambda \trianglelefteq \mu \Rightarrow \lambda \le_{\mathrm{lex}} \mu.$$

So $\hat{K}$ is lower-triangular. The final statement about the basis of $\Lambda_N$ follows by writing
partitions in $\mathrm{Par}_N'$ in the form $1^{i_1} 2^{i_2} \cdots N^{i_N}$ and noting that the vector space $\Lambda_N$ is the
direct sum of the subspaces $\Lambda_N^k$. □


10.78. Remark. Combining this theorem with 10.49, we can write $E = (\hat{K} K)M$, where $M$
is the vector of monomial symmetric polynomials indexed by $\mathrm{Par}_N(k)$. This matrix equation
gives the monomial expansion of elementary symmetric polynomials.




10.15 Algebraic Independence 


We will use 10.77 to obtain structural information about the ring $\Lambda_N$ of symmetric
polynomials in $N$ variables. First we need the following definition.


10.79. Definition: Algebraic Independence. Let $A$ be a commutative ring containing
$K$, and let $(z_1, \ldots, z_N)$ be a list of elements of $A$. We say that $(z_1, \ldots, z_N)$ is algebraically
independent over $K$ iff the collection of monomials

$$\{z^\alpha = z_1^{\alpha_1} z_2^{\alpha_2} \cdots z_N^{\alpha_N} : \alpha \in \mathbb{N}^N\}$$

is linearly independent over $K$. This means that whenever a finite $K$-linear combination of
the $z^\alpha$'s is zero, say

$$\sum_{\alpha} c_\alpha z^\alpha = 0 \qquad (c_\alpha \in K),$$

then all $c_\alpha$'s must be zero.
Here is yet another formulation of the definition. Let $K[Z_1, \ldots, Z_N]$ be a polynomial ring
in $N$ indeterminates. Given any list $(z_1, \ldots, z_N) \in A^N$, we get an evaluation homomorphism
$T : K[Z_1, \ldots, Z_N] \to A$ that sends each $Z_i$ to $z_i$ (see 7.102). One can check that the image
of $T$ is the subring $B$ of $A$ generated by $K$ and the $z_i$'s. On the other hand, the kernel
of $T$ consists precisely of the polynomials $\sum_\alpha c_\alpha Z^\alpha$ such that $\sum_\alpha c_\alpha z^\alpha = 0$. So, the $z_i$'s
are algebraically independent over $K$ iff $\ker(T) = \{0\}$ iff $T$ is injective. In this case, $T$
(with codomain restricted to $B$) is a ring isomorphism, and $K[Z_1, \ldots, Z_N] \cong B$. So we may
identify $Z_i$ with $z_i$ and write $B = K[z_1, \ldots, z_N]$.


10.80. Example. Let $K[x_1, \ldots, x_N]$ be a polynomial ring in indeterminates $x_i$. By the
very definition of polynomials, $\sum_\alpha c_\alpha x^\alpha = 0$ implies all $c_\alpha$ are zero. So the indeterminates
$x_1, \ldots, x_N$ are algebraically independent over $K$. The evaluation map $T$ above can be taken
to be the identity function on $K[x_1, \ldots, x_N]$. On the other hand, consider the three
polynomials $z_1 = x_1 + x_2$, $z_2 = x_1^2 + x_2^2$, and $z_3 = x_1^3 + x_2^3$. The elements $z_1, z_2, z_3$ are linearly
independent over $K$, as one may check. However, they are not algebraically independent
over $K$, because of the relation

$$z_1^3 - 3 z_1 z_2 + 2 z_3 = 0.$$

Later, we will see that $z_1$ and $z_2$ are algebraically independent over $K$.
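The relation above is easy to confirm by evaluation. The following sketch (function name hypothetical) evaluates the left side at integer points. Since the expression has degree at most 3 in each of $x_1$ and $x_2$, vanishing on a grid of at least $4 \times 4$ integer points already forces it to be the zero polynomial.

```python
def relation(x1, x2):
    """Evaluate z1^3 - 3*z1*z2 + 2*z3 for z_k = x1^k + x2^k."""
    z1 = x1 + x2
    z2 = x1**2 + x2**2
    z3 = x1**3 + x2**3
    return z1**3 - 3 * z1 * z2 + 2 * z3


# Exact integer arithmetic on a 4 x 4 grid of points.
print(all(relation(a, b) == 0 for a in range(4) for b in range(4)))
```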


By 10.77, $\Lambda_N$ is the subring of $K[x_1, \ldots, x_N]$ generated by $K$ and the elementary
symmetric polynomials. Combining the last part of 10.77 with 10.79, we deduce the following
structural result.


10.81. Fundamental Theorem of Symmetric Polynomials. The elementary
symmetric polynomials

$$\{e_i(x_1, \ldots, x_N) : 1 \le i \le N\} \subseteq K[x_1, \ldots, x_N]$$

are algebraically independent over $K$. Furthermore, if $K[E_1, \ldots, E_N]$ is another polynomial
ring, then the evaluation map $T : K[E_1, \ldots, E_N] \to \Lambda_N$ sending $E_i$ to $e_i(x_1, \ldots, x_N)$
is an isomorphism of rings and $K$-vector spaces. So, for every symmetric polynomial
$f(x_1, \ldots, x_N)$, there exists a unique polynomial $g(E_1, \ldots, E_N)$ such that $f = T(g) =
g(e_1, \ldots, e_N)$.


10.82. Remark. An algorithmic proof of the existence assertion in the fundamental theo- 
rem is sketched in 10.211. 


10.16 Power-Sum Symmetric Polynomials 


Recall that the power-sum symmetric polynomials in $N$ variables are defined by setting
$p_k(x_1, \ldots, x_N) = \sum_{i=1}^{N} x_i^k$ for all $k \ge 1$ and $p_\alpha(x_1, \ldots, x_N) = \prod_{j \ge 1} p_{\alpha_j}(x_1, \ldots, x_N)$. It
turns out that the polynomials $(p_1, \ldots, p_N)$ are algebraically independent over $K$. One way
to prove this is to invoke the following determinant criterion for algebraic independence.


10.83. Theorem: Determinant Test for Algebraic Independence. Let $g_1, \ldots, g_N$
be $N$ polynomials in $K[x_1, \ldots, x_N]$. Let $A$ be the $N \times N$ matrix whose $j,k$-entry is the
formal partial derivative $D_j g_k = \partial g_k / \partial x_j$ (see 7.103), and let $J \in K[x_1, \ldots, x_N]$ be the
determinant of $A$ (see 9.37). If $J \ne 0$, then $g_1, \ldots, g_N$ are algebraically independent over
$K$.
Proof. We prove the contrapositive. Assume $g_1, \ldots, g_N$ are algebraically dependent over
$K$. Then there exist nonzero polynomials $h \in K[Z_1, \ldots, Z_N]$ such that $h(g_1, \ldots, g_N) = 0$.
Choose such an $h$ whose total degree in the $Z_i$'s is as small as possible. We can take the
partial derivative of $h(g_1, \ldots, g_N)$ with respect to $x_j$ by applying the formal multivariable
chain rule (see 7.104), obtaining the relations

$$\sum_{k=1}^{N} (D_k h)(g_1, \ldots, g_N)\, \frac{\partial g_k}{\partial x_j} = 0 \qquad (1 \le j \le N).$$

Let $v$ be the column vector whose $k$th entry is $(D_k h)(g_1, \ldots, g_N)$. The preceding relations
are equivalent to the matrix identity $Av = 0$. We will show that $v$ is not the zero vector.
Since $h \in K[Z_1, \ldots, Z_N]$ is nonzero (and nonconstant, because $h(g_1, \ldots, g_N) = 0$) and $K$
contains $\mathbb{Q}$, at least one partial derivative $D_k h \in K[Z_1, \ldots, Z_N]$ must be nonzero. Given
such a $k$ with $D_k h \ne 0$, the total degree of $D_k h$ in the $Z_i$'s must be lower than the total
degree of $h$. By choice of $h$, it follows that $(D_k h)(g_1, \ldots, g_N)$ is nonzero in $K[x_1, \ldots, x_N]$.
This polynomial is the $k$th entry of $v$, so $v \ne 0$. Now $Av = 0$ forces $A$ to be a singular
matrix, so $J = \det(A) = 0$ by a theorem of linear algebra. □


10.84. Remark. The converse of 10.83 is also true: if $g_1, \ldots, g_N$ are algebraically
independent, then the “Jacobian” $J$ will be nonzero. This fact is not needed in the sequel, so we
omit the proof.


10.85. Theorem: Algebraic Independence of Power-Sums. Let $K$ be a field
containing $\mathbb{Q}$. The power-sum polynomials

$$\{p_k(x_1, \ldots, x_N) : 1 \le k \le N\} \subseteq K[x_1, \ldots, x_N]$$

are algebraically independent over $K$.


Proof. We use the determinant criterion in 10.83. The $j,k$-entry of the matrix $A$ is

$$D_j p_k = \frac{\partial}{\partial x_j}\left(x_1^k + x_2^k + \cdots + x_j^k + \cdots + x_N^k\right) = k x_j^{k-1}.$$

Accordingly, $J = \det[k x_j^{k-1}]_{1 \le j,k \le N}$. For each column $k$, we may factor out the scalar $k$
to see that $J = N! \det[x_j^{k-1}]$. The resulting determinant is called a Vandermonde
determinant. This determinant evaluates to $\pm \prod_{1 \le r < s \le N} (x_r - x_s)$, which is a nonzero polynomial
(see §12.9 for a combinatorial proof of this formula). Since $K$ contains $\mathbb{Q}$, the scalar $N!$ is
not zero in $K$. We conclude that $J \ne 0$, which proves the result. □
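The factorization $J = N! \det[x_j^{k-1}]$ can be sanity-checked at a numerical point. The sketch below uses hypothetical function names and a brute-force Leibniz expansion that is only suitable for small $N$. It fixes the sign convention $\prod_{r<s}(x_s - x_r)$ for the Vandermonde determinant, which matches the row/column indexing used in the code.

```python
from itertools import permutations
from math import factorial, prod


def det(matrix):
    """Determinant by the Leibniz permutation expansion (fine for small N)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        total += (-1) ** inversions * prod(matrix[r][perm[r]] for r in range(n))
    return total


def power_sum_jacobian(xs):
    """J = det[k * x_j^(k-1)], rows indexed by j and columns by k = 1..N."""
    n = len(xs)
    return det([[k * x ** (k - 1) for k in range(1, n + 1)] for x in xs])


def vandermonde_product(xs):
    """Product of (x_s - x_r) over all pairs r < s."""
    n = len(xs)
    return prod(xs[s] - xs[r] for r in range(n) for s in range(r + 1, n))


xs = [2, 3, 5, 7]
print(power_sum_jacobian(xs) == factorial(len(xs)) * vandermonde_product(xs))
```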


Now that we know that the $p_k$'s are algebraically independent, we can obtain power-sum
bases for the vector spaces $\Lambda_N^k$ and $\Lambda_N$.


10.86. Theorem: Power-Sum Basis. Let $K$ be a field containing $\mathbb{Q}$. For all $k, N \in \mathbb{N}$,
the collection

$$\{p_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)'\} \subseteq K[x_1, \ldots, x_N]$$

is a basis of the $K$-vector space $\Lambda_N^k$. The collection $\{p_1^{i_1} \cdots p_N^{i_N} : i_j \ge 0\}$ is a basis of the
$K$-vector space $\Lambda_N$. Letting $P_1, \ldots, P_N$ be new indeterminates, there is an isomorphism
of rings and $K$-vector spaces $T : K[P_1, \ldots, P_N] \to \Lambda_N$ such that $T(P_j) = p_j(x_1, \ldots, x_N)$.
So, for every symmetric polynomial $f(x_1, \ldots, x_N)$, there exists a unique polynomial $g$ with

$$f = g(p_1, \ldots, p_N).$$
10.17 Relations between e’s and h’s


We have seen that, in the polynomial ring $K[x_1, \ldots, x_N]$, the lists $(e_1, \ldots, e_N)$ and
$(p_1, \ldots, p_N)$ are each algebraically independent. One might wonder if the polynomials
$(h_1, \ldots, h_N)$ are also algebraically independent over $K$. This would follow (as it did for
the $e$'s) if we knew that $\{h_\mu : \mu \in \mathrm{Par}_N(k)'\}$ was a basis of $\Lambda_N^k$ for all $k$. However, the basis
we found in 10.72 was $\{h_\mu : \mu \in \mathrm{Par}_N(k)\}$, which is indexed by partitions of $k$ with at most
$N$ parts, instead of partitions of $k$ with each part at most $N$. The next result will allow us
to overcome this difficulty by providing equations relating $e_1, \ldots, e_N$ to $h_1, \ldots, h_N$.


10.87. Theorem: Recursion for $e_i$'s and $h_j$'s. For all $m, N \in \mathbb{N}$, we have the identity

$$\sum_{i=0}^{m} (-1)^i e_i(x_1, \ldots, x_N)\, h_{m-i}(x_1, \ldots, x_N) = \chi(m = 0). \qquad (10.7)$$


Proof. If $m = 0$, the identity reads $1 = 1$, so let us assume $m > 0$. We can model the left
side of the identity using a collection $Z$ of signed, weighted objects. A typical object in $Z$
is a triple $z = (i, S, T)$, where $0 \le i \le m$, $S \in \mathrm{SSYT}_N((1^i))$, and $T \in \mathrm{SSYT}_N((m - i))$.
The weight of $(i, S, T)$ is $x^{c(S)} x^{c(T)}$, and the sign of $(i, S, T)$ is $(-1)^i$. For example, taking
$N = 9$ and $m = 7$, a typical object in $Z$ is $z = (3, S, T)$, where $S$ is the column with entries
2, 4, 7 (from top to bottom) and $T$ is the row with entries 3, 3, 4, 6.

The signed weight of this object is $(-1)^3 (x_2 x_4 x_7)(x_3^2 x_4 x_6) = -x_2 x_3^2 x_4^2 x_6 x_7$. Recalling that
$e_i = s_{(1^i)}$ and $h_{m-i} = s_{(m-i)}$, we see that the left side of (10.7) is precisely

$$\sum_{z \in Z} \mathrm{sgn}(z)\, \mathrm{wt}(z).$$
To prove this expression is zero, we define a sign-reversing involution $I : Z \to Z$ with no
fixed points. Given $z = (i, S, T) \in Z$, we compute $I(z)$ as follows. Let $j = S((1,1))$ be the
smallest entry in $S$, and let $k = T((1,1))$ be the leftmost entry in $T$. If $i = 0$, then $S$ is
empty and $j$ is undefined; if $i = m$, then $T$ is empty and $k$ is undefined. Since $m > 0$, at
least one of $j$ or $k$ is defined. If $j \le k$ or $k$ is not defined, move the box containing $j$ from
$S$ to $T$, so that this box is the new leftmost entry in $T$, and decrement $i$ by 1. Otherwise,
if $k < j$ or $j$ is not defined, move the box containing $k$ from $T$ to $S$, so that this box is the
new topmost box in $S$, and increment $i$ by 1. For example, if $z$ is the object shown above,
then $I(z) = (2, S', T')$, where $S'$ is the column with entries 4, 7 and $T'$ is the row with
entries 2, 3, 3, 4, 6. As another example, if $T$ is the row with entries 2, 2, 3, 5, 5, 7, 9, then
$I((0, \emptyset, T)) = (1, S', T')$, where $S'$ is the single box 2 and $T'$ is the row 2, 3, 5, 5, 7, 9.

From the definition of $I$, we can check that $I$ does map $Z$ into $Z$, that $I \circ I = \mathrm{id}_Z$, that $I$
is weight-preserving and sign-reversing, and that $I$ has no fixed points. □
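The properties of $I$ claimed at the end of the proof can be verified exhaustively for small parameters. In the sketch below (names hypothetical), a column $S$ is stored as a strictly increasing tuple read top to bottom, and a row $T$ as a weakly increasing tuple read left to right.

```python
from itertools import combinations, combinations_with_replacement


def involution(obj):
    """Apply I to (i, S, T): S is a strictly increasing column, T a weakly increasing row."""
    i, S, T = obj
    j = S[0] if S else None           # smallest entry of S (undefined if S is empty)
    k = T[0] if T else None           # leftmost entry of T (undefined if T is empty)
    if k is None or (j is not None and j <= k):
        return (i - 1, S[1:], (j,) + T)   # move j to the front of T
    return (i + 1, (k,) + S, T[1:])       # move k to the top of S


def objects(N, m):
    """All triples (i, S, T) with S in SSYT_N((1^i)) and T in SSYT_N((m - i))."""
    letters = range(1, N + 1)
    for i in range(m + 1):
        for S in combinations(letters, i):
            for T in combinations_with_replacement(letters, m - i):
                yield (i, S, T)


Z = list(objects(3, 3))
print(all(involution(involution(z)) == z and involution(z) != z for z in Z))
print(involution((3, (2, 4, 7), (3, 3, 4, 6))))
```

The second line prints the image of the sample object from the proof, with the column and row written as tuples.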


10.88. Theorem: Algebraic Independence of h’s. For all $k, N \in \mathbb{N}$, the collection

$$\{h_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)'\} \subseteq K[x_1, \ldots, x_N]$$

is a basis of the $K$-vector space $\Lambda_N^k$. Consequently, the collection $\{h_1^{i_1} \cdots h_N^{i_N} : i_j \ge 0\}$ is a basis
of the $K$-vector space $\Lambda_N$, and $(h_1, \ldots, h_N)$ is algebraically independent over $K$.
Letting $H_1, \ldots, H_N$ be new indeterminates, there is a ring and $K$-vector space isomorphism
$T : K[H_1, \ldots, H_N] \to \Lambda_N$ given by $T(H_j) = h_j(x_1, \ldots, x_N)$. So, for every symmetric
polynomial $f(x_1, \ldots, x_N)$, there exists a unique polynomial $g$ with $f = g(h_1, \ldots, h_N)$.


Proof. It suffices to prove the statement about the basis of $\Lambda_N^k$, from which the other
assertions follow. By 10.72, we know that

$$|\{h_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)'\}| \le |\{h_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)\}| = \dim_K(\Lambda_N^k).$$

So, by a theorem of linear algebra, it is enough to prove that $\{h_\mu : \mu \in \mathrm{Par}_N(k)'\}$ is a
spanning set of $\Lambda_N^k$. For each $k \ge 0$, let $V_N^k$ be the vector subspace of $\Lambda_N^k$ spanned by these
$h_\mu$'s. We want to prove $V_N^k = \Lambda_N^k$ for all $k$. It will suffice to show that $e_1^{i_1} \cdots e_N^{i_N} \in V_N^k$
for all $i_1, \ldots, i_N$ with $i_1 + 2i_2 + \cdots + N i_N = k$, since these elementary symmetric polynomials
are known to be a basis of $\Lambda_N^k$. Now, one can check that $f \in V_N^k$ and $g \in V_N^m$ imply $fg \in V_N^{k+m}$.
(This holds when $f$ and $g$ are each products of $h_1, \ldots, h_N$, and the general case follows by
linearity and the distributive law.) Using this remark, we can further reduce to proving that
$e_j(x_1, \ldots, x_N) \in V_N^j$ for $1 \le j \le N$.

We prove this by induction on $j$. The result is true for $j = 1$, since $e_1 = \sum_{k=1}^{N} x_k =
h_1 \in V_N^1$. Assume $1 < j \le N$ and the result is known to hold for all smaller values of $j$.
Taking $m = j$ in the recursion (10.7), we have

$$e_j = e_{j-1} h_1 - e_{j-2} h_2 + e_{j-3} h_3 - \cdots \pm e_1 h_{j-1} \mp h_j.$$

Since $e_{j-s} \in V_N^{j-s}$ (by induction) and $h_s \in V_N^s$ (by definition) for $1 \le s \le j$, each term on
the right side lies in $V_N^j$. Since $V_N^j$ is a subspace, it follows that $e_j \in V_N^j$, completing the
induction. □


10.18 Generating Functions for e’s and h’s 
Another approach to the identities (10.7) involves generating functions. 
10.89. Definition: $E_N(t)$ and $H_N(t)$. For each $N \ge 1$, define the polynomial

$$E_N(t) = \prod_{i=1}^{N} (1 + x_i t) \in F(x_1, \ldots, x_N)[t]$$

and the formal power series

$$H_N(t) = \prod_{i=1}^{N} \frac{1}{1 - x_i t} \in F(x_1, \ldots, x_N)[[t]].$$


10.90. Theorem: Expansion of $E_N(t)$. For all $N \ge 1$, we have

$$E_N(t) = \sum_{k=0}^{N} e_k(x_1, \ldots, x_N)\, t^k.$$

Proof. Let us use the generalized distributive law 2.7 to expand the product in the definition
of $E_N(t)$. We obtain

$$E_N(t) = \prod_{i=1}^{N} (1 + x_i t) = \sum_{S \subseteq \{1,2,\ldots,N\}} \prod_{i \in S} (x_i t) \prod_{i \notin S} 1.$$

To get terms involving $t^k$, we must restrict the sum to subsets $S$ of size $k$. Such subsets
can be identified with increasing sequences $1 \le i_1 < i_2 < \cdots < i_k \le N$. Therefore, the
coefficient of $t^k$ in $E_N(t)$ is

$$\sum_{1 \le i_1 < i_2 < \cdots < i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k} = e_k(x_1, \ldots, x_N). \qquad □$$


10.91. Theorem: Relation between Roots and Coefficients of a Polynomial.
Suppose a polynomial $p = X^N + a_1 X^{N-1} + \cdots + a_i X^{N-i} + \cdots + a_{N-1} X + a_N \in K[X]$
factors as $p = (X - r_1)(X - r_2) \cdots (X - r_N)$ for some $r_i \in K$. For $1 \le i \le N$,

$$a_i = (-1)^i e_i(r_1, r_2, \ldots, r_N).$$

Proof. One can prove this by expanding $\prod_{i=1}^{N} (X - r_i)$ with the generalized distributive law.
Alternatively, we can deduce the result from 10.90 as follows. Replacing $t$ by $1/X$ and $x_i$
by $-r_i$ in $E_N(t)$ gives $\prod_{i=1}^{N} (1 - r_i/X) = X^{-N} p$. Using 10.90, we conclude that

$$p = X^N \sum_{k=0}^{N} e_k(-r_1, \ldots, -r_N) X^{-k} = \sum_{k=0}^{N} (-1)^k e_k(r_1, \ldots, r_N) X^{N-k}.$$

Taking the coefficient of $X^{N-i}$ gives the result. □
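As a concrete check of this theorem, the sketch below (function names hypothetical) multiplies out $(X - r_1) \cdots (X - r_N)$ for a list of integer roots and compares each coefficient $a_i$ with $(-1)^i e_i(r_1, \ldots, r_N)$, where $e_i$ is computed directly as a sum over $i$-element subsets.

```python
from itertools import combinations
from math import prod


def elementary(k, values):
    """e_k(values): sum of the products over all k-element subsets."""
    return sum(prod(subset) for subset in combinations(values, k))


def monic_from_roots(roots):
    """Coefficients [1, a_1, ..., a_N] of (X - r_1)...(X - r_N), highest degree first."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (X - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


roots = [2, -1, 3, 5]
coeffs = monic_from_roots(roots)
print(all(coeffs[i] == (-1) ** i * elementary(i, roots) for i in range(len(roots) + 1)))
```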


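10.92. Theorem: Expansion of $H_N(t)$. For all $N \ge 1$, we have

$$H_N(t) = \sum_{k=0}^{\infty} h_k(x_1, \ldots, x_N)\, t^k.$$

Proof. Using the geometric series formula, we have

$$H_N(t) = \prod_{i=1}^{N} \frac{1}{1 - x_i t} = \prod_{i=1}^{N} \sum_{j_i=0}^{\infty} (x_i t)^{j_i}.$$

Next, using the generalized distributive law, we get

$$H_N(t) = \sum_{(j_1,\ldots,j_N) \in \mathbb{N}^N} \prod_{i=1}^{N} (x_i t)^{j_i} = \sum_{(j_1,\ldots,j_N) \in \mathbb{N}^N} x_1^{j_1} \cdots x_N^{j_N}\, t^{j_1 + \cdots + j_N}.$$

The coefficient of $t^k$ consists of the sum of all possible monomials in $x_1, \ldots, x_N$ of degree
$k$, which is precisely $h_k(x_1, \ldots, x_N)$. □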


Now we can give an algebraic proof of 10.87. Note that

$$H_N(t)\, E_N(-t) = \prod_{i=1}^{N} \frac{1}{1 - x_i t} \prod_{i=1}^{N} (1 - x_i t) = 1.$$

Equating the coefficients of $t^m$ on both sides gives the identities (10.7).
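The generating-function argument translates directly into a numerical check. The sketch below (function names hypothetical) specializes the variables to integers, computes the coefficients of $E_N(t)$ and of $H_N(t)$ truncated at a chosen degree, and evaluates the alternating sum in (10.7).

```python
def e_coefficients(xs, degree):
    """Coefficients of E_N(t) = prod (1 + x_i t) up to t^degree."""
    coeffs = [1] + [0] * degree
    for x in xs:
        # Multiply by (1 + x t), going from high degree down so old values are used.
        for d in range(degree, 0, -1):
            coeffs[d] += x * coeffs[d - 1]
    return coeffs


def h_coefficients(xs, degree):
    """Coefficients of H_N(t) = prod 1/(1 - x_i t) up to t^degree."""
    coeffs = [1] + [0] * degree
    for x in xs:
        # Dividing by (1 - x t) means new[d] = old[d] + x * new[d - 1].
        for d in range(1, degree + 1):
            coeffs[d] += x * coeffs[d - 1]
    return coeffs


def alternating_sum(xs, m):
    """Left side of (10.7) at x = xs."""
    e = e_coefficients(xs, m)
    h = h_coefficients(xs, m)
    return sum((-1) ** i * e[i] * h[m - i] for i in range(m + 1))


print([alternating_sum([2, 3, 5], m) for m in range(6)])
```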
10.19 Relations between p’s, e’s, and h’s


In this section, we will study recursions similar to (10.7) that relate the $h_n$'s and $e_n$'s to
the $p_k$'s. These recursions can be used to deduce the algebraic independence of the $p_k$'s
from the algebraic independence of the $h_n$'s (or $e_n$'s), and vice versa, by adapting the proof
of 10.88 to the new recursions.


10.93. Theorem: Recursion for $h_i$'s and $p_j$'s. For all $n, N \ge 1$, the following identity
is valid in $\Lambda_N$:

$$h_0 p_n + h_1 p_{n-1} + h_2 p_{n-2} + \cdots + h_{n-1} p_1 = n h_n. \qquad (10.8)$$


Proof. Let us interpret each side of the desired equation as the generating function for 
a suitable collection of weighted objects. For the left side, let X be the set of all triples 
(k,T,U), where: 0 << k <n; T € SSYTy((k)); and U consists of a row of n — k boxes all 
filled with the same integer i < N. The weight of such a triple is «(+e = pg’, 


For example, letting n = 8 and N = 9, here is a typical object in X of weight x?x2x323: 


20 = (5, [1 [12] 4]4}, ‘ 


For a fixed value of k, the generating function for the possible T’s is hy(a1,...,x2n) and the 
generating function for the possible U’s is pn_z(@1,...,2n). By the sum and product rules 
for weighted sets, the left side of (10.8) is the generating function for X. 

Now let $Y$ be the set of all pairs $(V,j)$, where $V \in \mathrm{SSYT}_N((n))$ and $1 \leq j \leq n$. We can
visualize an object in $Y$ as a semistandard tableau of shape $(n)$ in which the $j$th cell has
been marked. For example, here is a typical object in $Y$ of weight $x_1^2 x_3^5 x_4$:

    y_0 = [1|1|3|3*|3|3|3|4]

The generating function for the weighted set $Y$ is $n h_n(x_1,\ldots,x_N)$.

To prove (10.8), it suffices to define a weight-preserving bijection $f : X \to Y$. Given
$(k,T,U) \in X$, note that $U$ consists of a run of $n-k$ copies of some symbol $i$. To compute
$f((k,T,U))$, mark the first box in $U$ and splice the boxes of $U$ into $T$ in the appropriate
position to get a weakly increasing sequence. If $T$ already contains one or more $i$’s, the first
box of $U$ is inserted immediately after these $i$’s. For example,

    f(z_0) = [1|1|2|3*|3|3|4|4]

This insertion process is reversible, thanks to the marker. More precisely, define $g : Y \to X$
as follows. Given $(V,j) \in Y$, let $i$ be the entry in the $j$th cell of $V$. Starting at cell $j$ and
scanning right, remove each cell equal to $i$ from $V$ to get a pair of tableaux $T$ and $U$ as in
the definition of $X$. Define $g((V,j)) = (k,T,U)$, where $k$ is the number of boxes in $T$. For
example,

    g(y_0) = (4, [1|1|3|4], [3|3|3|3])

One may check that $f$ and $g$ are weight-preserving and are two-sided inverses of each
other. □
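
The maps $f$ and $g$ from this proof can be sketched in Python. This is only an illustration under stated assumptions: a one-row tableau is stored as a sorted list, the triple $(k,T,U)$ is represented as `(T, i, m)` where `m` is the number of boxes of $U$ and `i` their common entry, and the function names are hypothetical.

```python
from bisect import bisect_right


def f(T, i, m):
    """Splice m copies of i into the sorted row T, right after any i's already in T.

    Returns (V, j), where j is the 1-based position of the marked cell,
    i.e. the first box that came from U.
    """
    pos = bisect_right(T, i)          # number of entries of T that are <= i
    V = T[:pos] + [i] * m + T[pos:]
    return V, pos + 1


def g(V, j):
    """Inverse map: starting at the marked cell j, remove the run of equal entries."""
    i = V[j - 1]
    m = 0
    while j - 1 + m < len(V) and V[j - 1 + m] == i:
        m += 1
    T = V[:j - 1] + V[j - 1 + m:]
    return T, i, m


print(f([1, 1, 2, 4, 4], 3, 3))           # the object z_0 from the proof
print(g([1, 1, 3, 3, 3, 3, 3, 4], 4))     # the object y_0 from the proof
```

Applying `g` to the output of `f` returns the original triple, which mirrors the claim that the marker makes the insertion reversible.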


10.94. Theorem: Recursion for $e_i$’s and $p_i$’s. For all $n, N \geq 1$, the following identity
is valid in $\Lambda_N$:

$$e_0 p_n - e_1 p_{n-1} + e_2 p_{n-2} - \cdots + (-1)^{n-1} e_{n-1} p_1 = (-1)^{n-1} n e_n. \tag{10.9}$$




Proof. This time we interpret each side of the equation using suitable signed, weighted
objects. For the left side, let $X$ be the set of all triples $(k,T,U)$, where: $0 \leq k < n$;
$T \in \mathrm{SSYT}_N((1^k))$; and $U$ consists of a row of $n-k$ boxes all filled with the same integer
$i \leq N$. The weight of this triple is $x^{c(T)+c(U)}$, and the sign of this triple is $(-1)^k$.
[The example object $z_0$ displayed at this point is not recoverable from the source.]

Using the sum and product rules for weighted sets, one sees that $\sum_{z \in X} \operatorname{sgn}(z)\operatorname{wt}(z)$ is the
left side of (10.9).
Now let $Y = \{(T,j) : T \in \mathrm{SSYT}_N((1^n)),\ 1 \leq j \leq n\}$. We can think of each element
of $Y$ as a strictly increasing sequence of $n$ elements of $\{1,2,\ldots,N\}$ in which one of the
elements (the $j$th one) has been marked. The generating function for the weighted set $Y$ is
$n e_n(x_1,\ldots,x_N)$.


Let us define a weight-preserving, sign-reversing involution $I : X \to X$. Fix $(k,T,U) \in X$.
Since $k < n$, $U$ is not empty; let $j$ be the integer appearing in each box of $U$. The map
$I$ acts as follows. On one hand, if $k < n-1$ and $j$ does not appear in $T$, then increase $k$ by
1, remove one copy of $j$ from $U$, and insert this number in the proper position in $T$ to get a
sorted sequence. On the other hand, if $j$ does appear in $T$, then decrease $k$ by 1, remove the
unique copy of $j$ from $T$, and place another copy of $j$ in $U$. If neither of the two preceding
cases occurs, $(k,T,U)$ is a fixed point of $I$.
[The example $I(z_0)$ displayed at this point is not recoverable from the source.]

One can check that $I$ is a well-defined, weight-preserving, sign-reversing involution on $X$.

Let $Z$ be the set of fixed points of $I$. We see from the description of $I$ that $Z$ consists of
all triples $(n-1,T,[j])$ where $j$ does not appear in $T$. All of these triples have sign $(-1)^{n-1}$.
The proof will be complete if we can find a weight-preserving bijection $g : Z \to Y$. We define
$g$ by inserting a marked copy of $j$ into its proper position in the increasing sequence $T$. The
inverse map takes an increasing sequence of size $n$ with one marked element and removes
the marked element.
[The concluding example of this proof is not recoverable from the source.]


10.20 Power-Sum Expansion of $h_n$ and $e_n$


We can use the recursions in 10.93 and 10.94 to compute expansions for $h_n$ and $e_n$ in terms
of the power-sum symmetric polynomials $p_\mu$.


10.95. Example. We know that $h_0 = 1$ and $h_1 = p_1$. Next, since $h_0 p_2 + h_1 p_1 = 2h_2$, we
find that $h_2 = (p_{(2)} + p_{(1,1)})/2$. For $n = 3$, we have

$$h_0 p_3 + h_1 p_2 + h_2 p_1 = 3h_3,$$

so that

$$h_3 = \frac{1}{3}\left(p_3 + p_1 p_2 + \frac{p_2 + p_1^2}{2}\,p_1\right) = (1/3)p_{(3)} + (1/2)p_{(2,1)} + (1/6)p_{(1,1,1)}.$$

For $n = 4$, we use the relation

$$h_0 p_4 + h_1 p_3 + h_2 p_2 + h_3 p_1 = 4h_4$$

to find, after some calculations,

$$h_4 = (1/4)p_{(4)} + (1/3)p_{(3,1)} + (1/8)p_{(2,2)} + (1/4)p_{(2,1,1)} + (1/24)p_{(1,1,1,1)}.$$

These formulas become nicer if we multiply through by $n!$. For instance,

$$3!\,h_3 = 2p_{(3)} + 3p_{(2,1)} + 1p_{(1,1,1)};$$
$$4!\,h_4 = 6p_{(4)} + 8p_{(3,1)} + 3p_{(2,2)} + 6p_{(2,1,1)} + 1p_{(1,1,1,1)}.$$

Similar formulas can be derived for $n!\,e_n$, but here some signs occur. For instance,
calculations with (10.9) lead to the identities

$$3!\,e_3 = 2p_{(3)} - 3p_{(2,1)} + 1p_{(1,1,1)};$$
$$4!\,e_4 = -6p_{(4)} + 8p_{(3,1)} + 3p_{(2,2)} - 6p_{(2,1,1)} + 1p_{(1,1,1,1)}.$$

By comparing the coefficients in the power-sum expansion of $4!\,h_4$ to the entries in
Table 9.1, the reader may be led to conjecture the following result.
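
The calculations in this example can be automated. The sketch below is an illustration rather than the author's method: it solves recursion (10.8) for $h_n$, or (10.9) for $e_n$ rewritten as $e_n = \frac{1}{n}\sum_{k=0}^{n-1} (-1)^{n-1-k} e_k p_{n-k}$, and stores each expansion as a dictionary from partitions (tuples) to rational coefficients. The function names `power_sum_expansions` and `scaled` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from math import factorial


def power_sum_expansions(n_max, signed=False):
    """Return a list whose nth entry maps each partition mu of n to the
    coefficient of p_mu in h_n (signed=False) or in e_n (signed=True)."""
    table = [{(): Fraction(1)}]      # h_0 = e_0 = 1, the empty product of p's
    for n in range(1, n_max + 1):
        acc = defaultdict(Fraction)
        for k in range(n):
            sign = (-1) ** (n - 1 - k) if signed else 1
            for mu, coeff in table[k].items():
                # Multiplying p_mu by p_{n-k} appends the part n-k to mu.
                nu = tuple(sorted(mu + (n - k,), reverse=True))
                acc[nu] += sign * coeff / n
        table.append(dict(acc))
    return table


def scaled(expansion, n):
    """Multiply every coefficient by n! and convert to an integer."""
    return {mu: int(c * factorial(n)) for mu, c in expansion.items()}


H = power_sum_expansions(4)
E = power_sum_expansions(4, signed=True)
print(scaled(H[4], 4))
print(scaled(E[4], 4))
```

Scaling by $n!$ reproduces the integer coefficients displayed above for $3!\,h_3$, $4!\,h_4$, $3!\,e_3$, and $4!\,e_4$.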


10.96. Theorem: Power-Sum Expansion of $h_n$. For all $n, N \geq 1$, the following identity
is valid in $\Lambda_N$:
$$n!\,h_n = \sum_{\mu \in \mathrm{Par}(n)} (n!/z_\mu)\,p_\mu. \tag{10.10}$$


Proof. Recall from 9.134 that $n!/z_\mu$ is the number of permutations $\sigma \in S_n$ with cycle type
$\mu$. This suggests the following combinatorial interpretations for the two sides of (10.10).
The left side counts all pairs $(w,T)$, where $w = w_1 w_2 \cdots w_n \in S_n$ is a permutation written
in one-line form and $T = (i_1 \leq i_2 \leq \cdots \leq i_n)$ is an element of $\mathrm{SSYT}_N((n))$. Let $X$ be the
set of all such pairs, weighted by the content of $T$. For example, here is a typical element
of $X$ when $n = 8$, written as a two-rowed array:

    z_0 = [ w: 4 2 5 8 3 7 1 6 ]
          [ T: (row not recoverable from the source) ]

The right side of (10.10) counts all triples $(\mu,\sigma,C)$, where $\mu \in \mathrm{Par}(n)$, $\sigma \in S_n$ is a
permutation with cycle type $\mu$, and $C : \{1,2,\ldots,n\} \to \{1,2,\ldots,N\}$ is a coloring of the
numbers $1,\ldots,n$ using $N$ available colors such that all elements in the same cycle of $\sigma$ are
assigned the same color (cf. §9.19). Let the weight of such a triple be $\prod_{k=1}^{n} x_{C(k)}$, and let
$Y$ be the set of all such weighted triples. For example, a typical element of $Y$ is the triple

    y_0 = ((3,2,2,1), (1,6,3)(2,5)(7,4)(8), C), where C has the two-line form
          1 2 3 4 5 6 7 8
          3 2 3 3 2 3 3 3
To see why the factor $p_\mu(x_1,\ldots,x_N)$ arises, consider how we may choose the coloring
function $C$ once $\mu$ and $\sigma$ have been selected. Now, we know $\sigma$ is a product of cycles of
lengths $\mu_1, \mu_2, \ldots, \mu_l$. Choose the common color of the elements in the first cycle in any of
$N$ ways. Since $\mu_1$ elements all receive the same color, the generating function for this choice
is $x_1^{\mu_1} + x_2^{\mu_1} + \cdots + x_N^{\mu_1} = p_{\mu_1}(x_1,\ldots,x_N)$. Next, choose the common color of the elements in
the second cycle, which gives a factor of $p_{\mu_2}$, and so on. Multiplying the generating functions
for these choices gives $p_\mu(x_1,\ldots,x_N)$.

To complete the proof, we define weight-preserving maps $f : Y \to X$ and $g : X \to Y$
that are inverses of each other. To understand the definition of $f$, recall that a given $\sigma \in S_n$
can be written in cycle notation in several different ways, since the cycles can be presented
in any order, and elements within each cycle can be cyclically permuted. Given $(\mu,\sigma,C)$,
we will specify one particular cycle notation for $\sigma$ that depends on $C$, as follows. First,
cycles colored with smaller colors are to be written before cycles colored with larger colors.
Second, elements within each cycle are cyclically shifted so that the first element in each
cycle is the smallest element appearing in that cycle. Third, if there are several cycles that
have the same color, these cycles are ordered so that their minimal elements decrease from
left to right. For example, starting with the object $y_0$ above, we obtain the following cycle
notation for $\sigma$: $(2,5)(8)(4,7)(1,6,3)$. Note that $(2,5)$ is written first because this cycle has
color 2. The other cycles, which are all colored 3, are presented in the given order because
$8 > 4 > 1$. Finally, to compute $f((\mu,\sigma,C))$, we erase the parentheses from the chosen cycle
notation for $\sigma$ and write the color $C(i)$ directly beneath each $i$ in the resulting word. For
example,

    f(y_0) = [ w: 2 5 8 4 7 1 6 3 ]
             [ T: 2 2 3 3 3 3 3 3 ]

One may check that $f$ is well-defined, maps into $X$, and preserves weights.

Now consider how to define the inverse map $g : X \to Y$. Given $(w,T) \in X$ with
$w = w_1 \cdots w_n$ and $T = i_1 \leq \cdots \leq i_n$, the coloring map $C$ is defined by setting $C(w_j) = i_j$
for all $j$. To recover $\sigma$ from $w$ and $T$, we need to add parentheses to $w$ to recreate the
cycle notation satisfying the rules above. For each color $i$ in turn, look at the substring
of $w$ consisting of the symbols located above the $i$’s in $T$. Scan this substring from left to
right, and begin a new cycle each time a number is encountered that is smaller than all
of the preceding numbers in this substring. (The numbers that begin new cycles will be
called left-to-right minima relative to color $i$.) This procedure defines $\sigma$, and finally we set
$\mu = \operatorname{type}(\sigma)$. For example,


    g(z_0) = ((2,2,1,1,1,1), (4)(2,5)(8)(3,7)(1)(6), C), where C(w_j) = i_j as above.


The reader should check that $g(f(y_0)) = y_0$ and $f(g(z_0)) = z_0$. One can similarly verify
that $g \circ f = \mathrm{id}_Y$ and $f \circ g = \mathrm{id}_X$, so the proof is complete. □
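
The two maps in this proof are easy to experiment with. The following Python sketch is an illustration under stated assumptions: a permutation is given as a list of cycles (tuples), a coloring as a dictionary from elements to colors, and the names `f` and `g` simply mirror the proof.

```python
def f(cycles, C):
    """Canonical cycle notation of the proof, flattened to a word w with colors T."""
    rotated = []
    for cyc in cycles:
        m = cyc.index(min(cyc))
        rotated.append(cyc[m:] + cyc[:m])    # smallest element of the cycle first
    # Smaller colors first; within one color, minimal elements decrease.
    rotated.sort(key=lambda c: (C[c[0]], -c[0]))
    w = [a for cyc in rotated for a in cyc]
    T = [C[a] for a in w]
    return w, T


def g(w, T):
    """Recover (mu, cycles, C) by cutting w at left-to-right minima within each color."""
    C = dict(zip(w, T))
    cycles, prev_color, cur_min = [], None, None
    for a, color in zip(w, T):
        if color != prev_color or a < cur_min:
            cycles.append([a])               # a starts a new cycle
            prev_color, cur_min = color, a
        else:
            cycles[-1].append(a)
    cycles = [tuple(c) for c in cycles]
    mu = tuple(sorted((len(c) for c in cycles), reverse=True))
    return mu, cycles, C


# The object y_0 from the proof.
sigma = [(1, 6, 3), (2, 5), (7, 4), (8,)]
coloring = {1: 3, 2: 2, 3: 3, 4: 3, 5: 2, 6: 3, 7: 3, 8: 3}
w0, T0 = f(sigma, coloring)
print(w0, T0)
print(g(w0, T0))
```

Running `g` on `f(sigma, coloring)` returns the canonical cycle notation $(2,5)(8)(4,7)(1,6,3)$ together with the original coloring, as described in the proof.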

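Before considering the analogous theorem for $e_n$, we introduce the following notation.

10.97. Definition: The Sign Factor $\epsilon_\mu$. For every partition $\mu \vdash n$, let

$$\epsilon_\mu = (-1)^{n - \ell(\mu)} = \prod_{i=1}^{\ell(\mu)} (-1)^{\mu_i - 1}.$$

We proved in 9.34 that $\epsilon_\mu = \operatorname{sgn}(\sigma)$ for all $\sigma \in S_n$ such that $\operatorname{type}(\sigma) = \mu$.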


10.98. Theorem: Power-Sum Expansion of $e_n$. For all $n, N \geq 1$, the following identity
is valid in $\Lambda_N$:
$$n!\,e_n = \sum_{\mu \in \mathrm{Par}(n)} \epsilon_\mu (n!/z_\mu)\,p_\mu. \tag{10.11}$$
Proof. We use the notation $X$, $Y$, $f$, $g$, $z_0$, and $y_0$ from the proof of 10.96. Recall that
$\sum_{\mu \in \mathrm{Par}(n)} (n!/z_\mu)p_\mu$ is the generating function for the weighted set $Y$. To model the right side
of (10.11), we need to get the sign factors $\epsilon_\mu$ into this sum. We accomplish this by assigning
signs to objects in $Y$ as follows. Given $(\mu,\sigma,C) \in Y$, write $\sigma$ in cycle notation as described
previously. Attach a $+$ to the first (minimal) element of each cycle, and attach a $-$ to the
remaining elements in each cycle. The overall sign of $(\mu,\sigma,C)$ is the product of these signs,
which is $\prod_i (-1)^{\mu_i - 1} = \epsilon_\mu$. For example, the object $y_0$ considered previously is now written

    y_0 = ((3,2,2,1), (2+,5-)(8+)(4+,7-)(1+,6-,3-), C);

the sign of this object is $(-1)^4 = +1$.

The next step is to transfer these signs to the objects in $X$ using the weight-preserving
bijection $f : Y \to X$. Given $(w,T) \in X$, find the left-to-right minima relative to each color
$i$ (as discussed in the definition of $g$ in the proof of 10.96). Attach a $+$ to these numbers
and a $-$ to all other numbers in $w$. For example, $f(y_0)$ is now written

    f(y_0) = [ w: 2+ 5- 8+ 4+ 7- 1+ 6- 3- ]
             [ T: 2  2  3  3  3  3  3  3  ]


As another example,

    z_0 = [ w: 4+ 2+ 5- 8+ 3+ 7- 1+ 6+ ]
          [ T: (row not recoverable from the source) ]


The bijections $f$ and $g$ now preserve both signs and weights, by the way we defined signs
of objects in $X$. It follows that $\sum_{(w,T) \in X} \operatorname{sgn}((w,T))\operatorname{wt}((w,T))$ is precisely the right side
of (10.11).

Now we define a sign-reversing, weight-preserving involution $I : X \to X$. Fix $(w,T) \in X$.
If all the entries of $T$ are distinct, then $(w,T)$ will be a fixed point of $I$. We observe at once
that such a fixed point is necessarily positive, and the generating function for such objects is
precisely $n!\,e_n(x_1,\ldots,x_N)$. On the other hand, suppose some color $i$ appears more than one
time in $T$. Choose the smallest color $i$ with this property, and let $w_k, w_{k+1}$ be the first two
symbols in the substring of $w$ located above this color. Define $I((w,T))$ by switching $w_k$ and
$w_{k+1}$; one checks that this is a weight-preserving involution. Furthermore, one may verify
that switching these two symbols will change the number of left-to-right minima (relative
to color $i$) by exactly 1. For example,

    I(f(y_0)) = [ w: 5+ 2+ 8+ 4+ 7- 1+ 6- 3- ]
                [ T: 2  2  3  3  3  3  3  3  ]


As another example,

    I(z_0) = [ w: 2+ 4- 5- 8+ 3+ 7- 1+ 6+ ]
             [ T: (row not recoverable from the source) ]


In general, note that $w_k$ is always labeled by $+$; $w_{k+1}$ is labeled $+$ iff $w_k > w_{k+1}$; and the
signs attached to numbers following $w_{k+1}$ do not depend on the order of the two symbols
$w_k, w_{k+1}$. We have now shown that $I$ is sign-reversing, so the proof is complete. □
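
The sign rule and the involution $I$ can be sketched in Python as follows. This is an illustration only: pairs $(w,T)$ are stored as two lists, the function names are hypothetical, and `signed_count` specializes every $x_i$ to 1, so that by (10.11) it should return $n!\binom{N}{n}$.

```python
from itertools import combinations_with_replacement, permutations


def sign(w, T):
    """+1 or -1: each entry that is not a left-to-right minimum
    relative to its color contributes a factor of -1."""
    s, prev_color, cur_min = 1, None, None
    for a, color in zip(w, T):
        if color != prev_color or a < cur_min:
            prev_color, cur_min = color, a   # a is a left-to-right minimum
        else:
            s = -s
    return s


def involution(w, T):
    """Swap the first two symbols above the smallest repeated color, if any."""
    for k in range(len(T) - 1):
        if T[k] == T[k + 1]:                 # T is sorted, so this is the smallest such color
            w2 = list(w)
            w2[k], w2[k + 1] = w2[k + 1], w2[k]
            return w2, list(T)
    return list(w), list(T)                  # all colors distinct: fixed point


def signed_count(n, N):
    """Sum of sign(w, T) over all pairs (w, T) in X, i.e. with every x_i set to 1."""
    return sum(
        sign(w, T)
        for w in permutations(range(1, n + 1))
        for T in combinations_with_replacement(range(1, N + 1), n)
    )


print(involution([2, 5, 8, 4, 7, 1, 6, 3], [2, 2, 3, 3, 3, 3, 3, 3]))
print(signed_count(3, 4))
```

Applying `involution` twice returns the original pair, and on non-fixed points the sign flips, which is the cancellation used in the proof.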




10.21 The Involution $\omega$


Recall from §10.16 that $p_1,\ldots,p_N \in \Lambda_N$ are algebraically independent over $K$, so that
we can view $\Lambda_N$ as the polynomial ring $K[p_1,\ldots,p_N]$. By the universal mapping property
for polynomial rings (see 7.102), we can define a ring homomorphism with domain $\Lambda_N$ by
sending each $p_i$ to an arbitrary element of a given ring containing $K$. We now apply this
technique to define a certain homomorphism on $\Lambda_N$.


10.99. Definition: The Map $\omega$. Let $\omega : \Lambda_N \to \Lambda_N$ be the unique $K$-linear ring
homomorphism such that $\omega(p_i) = (-1)^{i-1} p_i$ for $1 \leq i \leq N$.


10.100. Theorem: $\omega$ is an Involution. We have $\omega^2 = \mathrm{id}_{\Lambda_N}$; in particular, $\omega$ is an
automorphism of $\Lambda_N$.

Proof. Observe that $\omega^2(p_i) = \omega(\omega(p_i)) = \omega((-1)^{i-1}p_i) = (-1)^{i-1}(-1)^{i-1}p_i = p_i = \mathrm{id}(p_i)$
for $1 \leq i \leq N$. Since $\omega^2$ and $\mathrm{id}$ are ring homomorphisms on $\Lambda_N$ that fix each $c \in K$ and
have the same effect on every $p_i$, the uniqueness part of the UMP for polynomial rings
shows that $\omega^2 = \mathrm{id}$. Since $\omega$ has a two-sided inverse (namely, itself), $\omega$ is a bijection. □


Let us investigate the effect of $\omega$ on various bases of $\Lambda_N$.


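10.101. Theorem: Action of $\omega$ on $p$’s, $h$’s, and $e$’s. Suppose $\nu \in \mathrm{Par}_N(k)'$, so $\nu_i \leq N$ for
all $i$. The following identities hold in $\Lambda_N$: (a) $\omega(p_\nu) = \epsilon_\nu p_\nu$; (b) $\omega(h_\nu) = e_\nu$; (c) $\omega(e_\nu) = h_\nu$.


Proof. (a) Since $\omega$ is a ring homomorphism and $\nu_i \leq N$ for all $i$,

$$\omega(p_\nu) = \omega\left(\prod_{i=1}^{\ell(\nu)} p_{\nu_i}\right) = \prod_{i=1}^{\ell(\nu)} \omega(p_{\nu_i}) = \prod_{i=1}^{\ell(\nu)} (-1)^{\nu_i - 1} p_{\nu_i} = \epsilon_\nu p_\nu.$$

(b) First, for any $n \leq N$, we have

$$\omega(h_n) = \omega\left(\sum_{\mu \in \mathrm{Par}(n)} z_\mu^{-1} p_\mu\right) = \sum_{\mu \in \mathrm{Par}(n)} z_\mu^{-1}\,\omega(p_\mu) = \sum_{\mu \in \mathrm{Par}(n)} \epsilon_\mu z_\mu^{-1} p_\mu = e_n$$

by 10.96 and 10.98. Since $\omega$ preserves multiplication, $\omega(h_\nu) = e_\nu$ follows.

(c) Part (c) follows by applying $\omega$ to both sides of (b), since $\omega^2 = \mathrm{id}$. □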
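10.102. Theorem: Action of $\omega$ on $s_\lambda$. If $\lambda \in \mathrm{Par}(n)$ and $n \leq N$, then $\omega(s_\lambda) = s_{\lambda'}$ in
$\Lambda_N$.

Proof. From 10.69, we know that for each $\mu \vdash n$,

$$h_\mu(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}(n)} K_{\lambda,\mu}\, s_\lambda(x_1,\ldots,x_N).$$

We can combine these equations into a single vector equation $H = K^t S$ where
$H = (h_\mu : \mu \in \mathrm{Par}(n))$ and $S = (s_\lambda : \lambda \in \mathrm{Par}(n))$. Since $K^t$ (the transpose of the Kostka matrix) is
unitriangular and hence invertible, $S = (K^t)^{-1}H$ is the unique vector $v$ satisfying $H = K^t v$.
From 10.75, we know that for each $\mu \vdash n$,

$$e_\mu(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}(n)} K_{\lambda,\mu}\, s_{\lambda'}(x_1,\ldots,x_N).$$

Applying the linear map $\omega$ to these equations produces the equations

$$h_\mu = \sum_{\lambda \in \mathrm{Par}(n)} K_{\lambda,\mu}\, \omega(s_{\lambda'}).$$

But this says that the vector $v = (\omega(s_{\lambda'}) : \lambda \in \mathrm{Par}(n))$ satisfies $H = K^t v$. By the uniqueness
property mentioned above, $v = S$. So, for all $\lambda \in \mathrm{Par}(n)$, $s_\lambda = \omega(s_{\lambda'})$. Replacing $\lambda$ by $\lambda'$
(or applying $\omega$ to both sides) gives the result. □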
What happens if we apply $\omega$ to the monomial basis of $\Lambda_N$? Since $\omega$ is a $K$-linear bijection,
we get another basis of $\Lambda_N$ that is different from those discussed so far. This basis is hard
to describe directly, so it is given the following name.


10.103. Definition: Forgotten Basis for $\Lambda_N$. For each $\lambda \in \mathrm{Par}_N$, define the forgotten
symmetric polynomial $\mathrm{fgt}_\lambda = \omega(m_\lambda)$. The set $\{\mathrm{fgt}_\lambda : \lambda \in \mathrm{Par}_N(k)\}$ is a basis of $\Lambda_N^k$.


10.22 Permutations and Tableaux 


Iteration of the tableau insertion algorithm (§10.9) leads to some remarkable bijections 
that map permutations, words, and matrices to certain pairs of tableaux. These bijections 
were studied by Robinson, Schensted, and Knuth, and are therefore called RSK correspon- 
dences. We begin in this section by showing how permutations can be encoded using pairs 
of standard tableaux of the same shape. 


10.104. Theorem: RSK Correspondence for Permutations. There is a bijection
$\mathrm{RSK} : S_n \to \bigcup_{\lambda \in \mathrm{Par}(n)} \mathrm{SYT}(\lambda) \times \mathrm{SYT}(\lambda)$. Given $\mathrm{RSK}(w) = (P(w), Q(w))$, we call $P(w)$
the insertion tableau for $w$ and $Q(w)$ the recording tableau for $w$.


Proof. Let $w \in S_n$ have one-line form $w = w_1 w_2 \cdots w_n$. We construct a sequence of tableaux
$P_0, P_1, \ldots, P_n = P(w)$ and a sequence of tableaux $Q_0, Q_1, \ldots, Q_n = Q(w)$ as follows.
Initially, let $P_0$ and $Q_0$ be empty tableaux of shape $(0)$. Suppose $1 \leq i \leq n$ and $P_{i-1}, Q_{i-1}$
have already been constructed. Define $P_i = P_{i-1} \leftarrow w_i$ (the tableau obtained by insertion
of $w_i$ into $P_{i-1}$). Let $(a,b)$ be the new cell in $P_i$ created by this insertion. Define $Q_i$ to be
the tableau obtained from $Q_{i-1}$ by placing the value $i$ in the new cell $(a,b)$. Informally, we
build $P(w)$ by inserting $w_1, \ldots, w_n$ (in this order) into an initially empty tableau. We build
$Q(w)$ by placing the numbers $1, 2, \ldots, n$ (in this order) in the new boxes created by each
insertion. By construction, $Q(w)$ has the same shape as $P(w)$. Furthermore, since the new
box at each stage is a corner box, one sees that $Q(w)$ will be a standard tableau. Finally,
set $\mathrm{RSK}(w) = (P(w), Q(w))$.

To see that RSK is a bijection, we present an algorithm for computing the inverse map.
Let $(P,Q)$ be any pair of standard tableaux of the same shape $\lambda \in \mathrm{Par}(n)$. The idea is
to recover the one-line form $w_1 \cdots w_n$ in reverse by uninserting entries from $P$, using the
entries in $Q$ to decide which box to remove at each stage (cf. §10.10). To begin, note that
$n$ occurs in some corner box $(a,b)$ of $Q$ (since $Q$ is standard). Apply reverse insertion to $P$
starting at $(a,b)$ to obtain the unique tableau $P_{n-1}$ and value $w_n$ such that $P_{n-1} \leftarrow w_n$ is
$P$ with new box $(a,b)$ (see 10.60). Let $Q_{n-1}$ be the tableau obtained by erasing $n$ from $Q$.
Continue similarly: having computed $P_i$ and $Q_i$ such that $Q_i$ is a standard tableau with $i$
cells, let $(a,b)$ be the corner box of $Q_i$ containing $i$. Apply reverse insertion to $P_i$ starting
at $(a,b)$ to obtain $P_{i-1}$ and $w_i$. Then delete $i$ from $Q_i$ to obtain a standard tableau $Q_{i-1}$
with $i-1$ cells. The resulting word $w = w_1 w_2 \cdots w_n$ is a permutation of $\{1,2,\ldots,n\}$ (since
$P$ contains each of these values exactly once), and our argument has shown that $w$ is the
unique object satisfying $\mathrm{RSK}(w) = (P,Q)$. So RSK is a bijection. □


10.105. Example. Let $w = 35164872 \in S_8$. Figure 10.1 illustrates the computation
of $\mathrm{RSK}(w) = (P(w), Q(w))$. As an example of the inverse computation, let us determine
the permutation $v = \mathrm{RSK}^{-1}(Q(w), P(w))$ (note that we have switched the order of the
insertion and recording tableaux). Figure 10.2 displays the reverse insertions used to find
$v_n, v_{n-1}, \ldots, v_1$. We see that $v = 38152476$.
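
The forward and inverse computations in this example can be reproduced with a short Python sketch. It is an illustration, not the book's own code: tableaux are stored as lists of rows, the function names are hypothetical, and the sketch assumes that the insertion rule of §10.9 is the usual row insertion (bump the leftmost entry strictly greater than the inserted value) and that reverse insertion bumps the rightmost entry strictly less than the value moving up.

```python
from bisect import bisect_left, bisect_right


def rsk(word):
    """Return (P, Q): insertion and recording tableaux as lists of rows."""
    P, Q = [], []
    for step, x in enumerate(word, start=1):
        r = 0
        while True:
            if r == len(P):                  # start a new row at the bottom
                P.append([x])
                Q.append([step])
                break
            row = P[r]
            pos = bisect_right(row, x)       # leftmost entry strictly greater than x
            if pos == len(row):              # x fits at the end of this row
                row.append(x)
                Q[r].append(step)
                break
            row[pos], x = x, row[pos]        # bump the old entry into the next row
            r += 1
    return P, Q


def rsk_inverse(P, Q):
    """Recover the word from (P, Q) by repeated reverse insertion."""
    P = [list(row) for row in P]
    Q = [list(row) for row in Q]
    n = sum(len(row) for row in Q)
    letters = []
    for step in range(n, 0, -1):
        # The largest remaining label sits at the end of some row of Q.
        r = next(i for i, row in enumerate(Q) if row and row[-1] == step)
        Q[r].pop()
        x = P[r].pop()
        for above in range(r - 1, -1, -1):
            row = P[above]
            pos = bisect_left(row, x) - 1    # rightmost entry strictly less than x
            row[pos], x = x, row[pos]
        letters.append(x)
    return letters[::-1]


w = [3, 5, 1, 6, 4, 8, 7, 2]
P, Q = rsk(w)
print(P, Q)
print(rsk_inverse(Q, P))   # insertion and recording tableaux switched
```

Calling `rsk_inverse(P, Q)` recovers `w`, while `rsk_inverse(Q, P)` yields the permutation $v$ of the example.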




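              Insertion Tableau            Recording Tableau

insert 3:     [3]                          [1]
insert 5:     [3|5]                        [1|2]
insert 1:     [1|5] / [3]                  [1|2] / [3]
insert 6:     [1|5|6] / [3]                [1|2|4] / [3]
insert 4:     [1|4|6] / [3|5]              [1|2|4] / [3|5]
insert 8:     [1|4|6|8] / [3|5]            [1|2|4|6] / [3|5]
insert 7:     [1|4|6|7] / [3|5|8]          [1|2|4|6] / [3|5|7]
insert 2:     [1|2|6|7] / [3|4|8] / [5]    [1|2|4|6] / [3|5|7] / [8]

(The rows of each tableau are listed from top to bottom, separated by "/".)

FIGURE 10.1
Computation of RSK(35164872).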


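Let us compare the two-line forms of $w$ and $v$:

$$w = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 5 & 1 & 6 & 4 & 8 & 7 & 2 \end{pmatrix}, \qquad v = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 8 & 1 & 5 & 2 & 4 & 7 & 6 \end{pmatrix}.$$

We see that $v$ and $w$ are inverse permutations!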


The phenomenon observed in the last example holds in general: if $w \mapsto (P,Q)$ under the
RSK correspondence, then $w^{-1} \mapsto (Q,P)$. To prove this fact, we must introduce a new way
of visualizing the construction of the insertion and recording tableaux for $w$.


10.106. Definition: Cartesian Graph of a Permutation. Given a permutation
$w = w_1 w_2 \cdots w_n \in S_n$, the graph of $w$ (in the $xy$-plane) is the set $G(w) = \{(i, w_i) : 1 \leq i \leq n\}$.


For example, the graph of $w = 35164872$ is drawn in Figure 10.3.
To analyze the creation of the insertion and recording tableaux for RSK(w), we will 
annotate the graph of w by drawing lines as described in the following definitions. 


10.107. Definition: Shadow Lines. Let $S = \{(x_1,y_1),\ldots,(x_k,y_k)\}$ be a finite set of
points in the first quadrant. The shadow of $S$ is

$$\mathrm{Shd}(S) = \{(u,v) \in \mathbb{R}^2 : \text{for some } i,\ u \geq x_i \text{ and } v \geq y_i\}.$$

Informally, the shadow consists of all points northeast of some point in $S$. The first shadow
line $L_1(S)$ is the boundary of $\mathrm{Shd}(S)$. This boundary consists of an infinite vertical ray
(part of the line $x = a_1$, say), followed by zero or more alternating horizontal and vertical
line segments, followed by an infinite horizontal ray (part of the line $y = b_1$, say). Call $a_1$
and $b_1$ the $x$-coordinate and $y$-coordinate associated to this shadow line. Next, let $S_1$ be the
set of points in $S$ that lie on the first shadow line of $S$. The second shadow line $L_2(S)$ is
the boundary of $\mathrm{Shd}(S \setminus S_1)$, which has associated coordinates $(a_2, b_2)$. Letting $S_2$ be the
points in $S$ that lie on the second shadow line, the third shadow line $L_3(S)$ is the boundary
of $\mathrm{Shd}(S \setminus (S_1 \cup S_2))$. We continue to generate shadow lines in this way until all points of
$S$ lie on some shadow line. Finally, the first-order shadow diagram of $w \in S_n$ consists of all
shadow lines associated to the graph $G(w)$.


                   Insertion Tableau            Recording Tableau            Output Value

initial tableau:   [1|2|4|6] / [3|5|7] / [8]    [1|2|6|7] / [3|4|8] / [5]
uninsert 7:        [1|2|4|7] / [3|5] / [8]      [1|2|6|7] / [3|4] / [5]      6
uninsert 7:        [1|2|4] / [3|5] / [8]        [1|2|6] / [3|4] / [5]        7
uninsert 4:        [1|2] / [3|5] / [8]          [1|2] / [3|4] / [5]          4
uninsert 8:        [1|5] / [3|8]                [1|2] / [3|4]                2
uninsert 8:        [1|8] / [3]                  [1|2] / [3]                  5
uninsert 3:        [3|8]                        [1|2]                        1
uninsert 8:        [3]                          [1]                          8
uninsert 3:        empty                        empty                        3

(The rows of each tableau are listed from top to bottom, separated by "/".)

FIGURE 10.2
Mapping pairs of standard tableaux to permutations.


FIGURE 10.3
Cartesian graph of a permutation. [Plot of the points $(i, w_i)$ of $G(w)$ for $w = 35164872$; image not reproduced.]


FIGURE 10.4
Shadow lines for a permutation graph. [Image not reproduced.]


10.108. Example. The first-order shadow diagram of w = 35164872 is drawn in Figure 
10.4. The x-coordinates associated to the shadow lines of w are 1, 2,4, 6. These x-coordinates 
agree with the entries in the first row of the recording tableau Q(w), which we computed 
in 10.105. Similarly, the y-coordinates of the shadow lines are 1,2,6,7, which are precisely 
the entries in the first row of the insertion tableau P(w). The next result explains why 
this happens, and shows that the shadow diagram contains complete information about the 
evolution of the first rows of P(w) and Q(w). 
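
The shadow-line coordinates in this example can be computed mechanically. The Python sketch below is an illustration with hypothetical function names; it assumes all points have distinct $x$-coordinates and distinct $y$-coordinates (as in a permutation graph), in which case the points lying on the first shadow line are exactly the left-to-right minima of the $y$-values. The second function iterates the construction on the tops of the vertical segments of each line, anticipating the higher-order diagrams discussed below.

```python
def shadow_lines(points):
    """Split the points into shadow lines L_1, L_2, ...;
    each line is the list of points of S lying on it, by increasing x."""
    remaining = sorted(points)
    lines = []
    while remaining:
        on_line, rest, lowest = [], [], float("inf")
        for x, y in remaining:
            if y < lowest:                   # no remaining point lies weakly southwest
                on_line.append((x, y))
                lowest = y
            else:
                rest.append((x, y))
        lines.append(on_line)
        remaining = rest
    return lines


def tableaux_from_shadows(w):
    """Rows of P(w) and Q(w), read off from the iterated shadow diagrams."""
    points = [(i, wi) for i, wi in enumerate(w, start=1)]
    P, Q = [], []
    while points:
        lines = shadow_lines(points)
        Q.append([line[0][0] for line in lines])     # associated x-coordinates
        P.append([line[-1][1] for line in lines])    # associated y-coordinates
        # Tops of the vertical segments feed the next-order diagram.
        points = [
            (line[k + 1][0], line[k][1])
            for line in lines
            for k in range(len(line) - 1)
        ]
    return P, Q


print(tableaux_from_shadows([3, 5, 1, 6, 4, 8, 7, 2]))
```

For $w = 35164872$ the first rows produced are $1,2,6,7$ and $1,2,4,6$, matching the coordinates listed in the example.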


10.109. Theorem: Shadow Lines and RSK. Let $w \in S_n$ have first-order shadow lines
$L_1, \ldots, L_k$ with associated coordinates $(x_1,y_1), \ldots, (x_k,y_k)$. Let $P_0, P_1, \ldots, P_n = P(w)$ and
$Q_0, Q_1, \ldots, Q_n = Q(w)$ be the sequences of tableaux generated in the computation of
$\mathrm{RSK}(w)$. For $0 \leq i \leq n$, the $y$-coordinates of the intersections of the shadow lines with the
line $x = i + (1/2)$ are the entries in the first row of $P_i$, whereas the entries in the first row of
$Q_i$ consist of all $x_j \leq i$. Whenever some shadow line $L_r$ has a vertical segment from $(i,a)$
down to $(i,b)$, then $b = w_i$ and the insertion $P_i = P_{i-1} \leftarrow w_i$ bumps the value $a$ out of the
$r$th cell in the first row of $P_{i-1}$.


Proof. We proceed by induction on $i \geq 0$. The theorem holds when $i = 0$, since $P_0$ and
$Q_0$ are empty, and no shadow lines intersect the line $x = 1/2$. Assume the result holds for
$i - 1 < n$. Then the first row of $P_{i-1}$ is $a_1 < a_2 < \cdots < a_j$, and the $a$’s are the $y$-coordinates
where the shadow lines hit the line $x = i - 1/2$. Consider the point $(i, w_i)$, which is the
unique point in $G(w)$ on the line $x = i$. First consider the case $w_i > a_j$. In this case, the
first $j$ shadow lines all pass underneath $(i, w_i)$. It follows that $(i, w_i)$ is the first point of
$G(w)$ on shadow line $L_{j+1}(G(w))$, so $x_{j+1} = i$. When we insert $w_i$ into $P_{i-1}$, $w_i$ goes at
the end of the first row of $P_{i-1}$ (since it exceeds the last entry $a_j$), and we place $i$ in the
corresponding cell in the first row of $Q_i$. The statements in the theorem regarding $P_i$ and
$Q_i$ are true in this case. Now consider the case $w_i < a_j$. Suppose $a_r$ is the smallest value
in the first row of $P_{i-1}$ exceeding $w_i$. Then insertion of $w_i$ into $P_{i-1}$ bumps $a_r$ out of the
first row. On the other hand, the point $(i, w_i)$ lies between the points $(i, a_{r-1})$ and $(i, a_r)$
in the shadow diagram (taking $a_0 = 0$). It follows from the way the shadow lines are drawn
that shadow line $L_r$ must drop from $(i, a_r)$ to $(i, w_i)$ when it reaches the line $x = i$. The
statements of the theorem therefore hold for $i$ in this case as well. □


FIGURE 10.5
Higher-order shadow diagrams. [Images not reproduced.]


To analyze the rows of $P(w)$ and $Q(w)$ below the first row, we iterate the shadow
diagram construction as follows.


10.110. Definition: Iterated Shadow Diagrams. Let $L_1, \ldots, L_k$ be the shadow lines
associated to a given subset $S$ of $\mathbb{R}^2$. An inner corner is a point $(a,b)$ at the top of one
of the vertical segments of some shadow line. Let $S'$ be the set of inner corners associated
to $S$. The second-order shadow diagram of $S$ is the shadow diagram associated to $S'$. We
iterate this process to define all higher-order shadow diagrams of $S$.


For example, taking $w = 35164872$, Figure 10.5 displays the second-order and third-order
shadow diagrams for $G(w)$.


10.111. Theorem: Higher-Order Shadows and RSK. For $w \in S_n$, let $L_1, \ldots, L_k$
be the shadow lines in the $r$th-order shadow diagram for $w$, with associated
coordinates $(x_1,y_1), \ldots, (x_k,y_k)$. Let $P_0, P_1, \ldots, P_n = P(w)$ and $Q_0, Q_1, \ldots, Q_n = Q(w)$ be
the sequences of tableaux generated in the computation of $\mathrm{RSK}(w)$. For $0 \leq i \leq n$, the
$y$-coordinates of the intersections of the shadow lines with the line $x = i + (1/2)$ are the
entries in the $r$th row of $P_i$, whereas the entries in the $r$th row of $Q_i$ consist of all $x_j \leq i$.
Whenever some shadow line $L_c$ has a vertical segment from $(i,a)$ down to $(i,b)$, then $b$ is
the value bumped out of row $r-1$ by the insertion $P_i = P_{i-1} \leftarrow w_i$, and $b$ bumps the value
$a$ out of the $c$th cell in row $r$ of $P_{i-1}$. (Take $b = w_i$ when $r = 1$.)


Proof. We use induction on $r \geq 1$. The base case $r = 1$ was proved in 10.109. Consider $r = 2$
next. The proof of 10.109 shows that the inner corners of the first-order shadow diagram
of $w$ are precisely those points $(i,b)$ such that $b$ is bumped out of the first row of $P_{i-1}$ and
inserted into the second row of $P_{i-1}$ when forming $P_i$. The argument used to prove 10.109
can now be applied to this set of points. Whenever a point $(i,b)$ lies above all second-order
shadow lines approaching the line $x = i$ from the left, $b$ gets inserted in a new cell at the
end of the second row of $P_i$ and the corresponding cell in $Q_i$ receives the label $i$. Otherwise,
if $(i,b)$ lies between shadow lines $L_{c-1}$ and $L_c$ in the second-order diagram, then $b$ bumps
the value in the $c$th cell of the second row of $P_{i-1}$ into the third row, and shadow line $L_c$
moves down to level $b$ when it reaches $x = i$. The statements in the theorem (for $r = 2$)
follow exactly as before by induction on $i$. Iterating this argument establishes the analogous
results for each $r > 2$. □


10.112. Theorem: RSK and Inversion. For all $w \in S_n$, if $\mathrm{RSK}(w) = (P,Q)$, then
$\mathrm{RSK}(w^{-1}) = (Q,P)$.


Proof. Consider the picture consisting of $G(w)$ and the first-order shadow diagram. Suppose
the shadow lines have associated $x$-coordinates $(a_1,\ldots,a_k)$ and $y$-coordinates $(b_1,\ldots,b_k)$.
Let us reflect the picture through the line $y = x$ (which interchanges $x$-coordinates and
$y$-coordinates). This reflection changes $G(w)$ into $G(w^{-1})$, since $(x,y) \in G(w)$ iff $y = w(x)$
iff $x = w^{-1}(y)$ iff $(y,x) \in G(w^{-1})$. We see from the geometric definition that the shadow
lines for $w$ get reflected into the shadow lines for $w^{-1}$. It follows from 10.109 that the first
row of both $Q(w)$ and $P(w^{-1})$ is $a_1,\ldots,a_k$, whereas the first row of both $P(w)$ and $Q(w^{-1})$
is $b_1,\ldots,b_k$. The inner corners for $w^{-1}$ are the reflections of the inner corners for $w$. So, we
can apply the same argument to the higher-order shadow diagrams of $w$ and $w^{-1}$. It follows
that all rows of $P(w^{-1})$ (resp. $Q(w^{-1})$) agree with the corresponding rows of $Q(w)$ (resp.
$P(w)$). □


10.23 Words and Tableaux 


We now generalize the RSK algorithm to operate on arbitrary words, not just permutations. 


10.113. Theorem: RSK Correspondence for Words. Let $W = X^n$ be the set of
$n$-letter words over an ordered alphabet $X$. There is a bijection

$$\mathrm{RSK} : W \to \bigcup_{\lambda \in \mathrm{Par}(n)} \mathrm{SSYT}_X(\lambda) \times \mathrm{SYT}(\lambda).$$

We write $\mathrm{RSK}(w) = (P(w), Q(w))$; $P(w)$ is the insertion tableau for $w$ and $Q(w)$ is the
recording tableau for $w$. For all $x \in X$, the number of $x$’s in $w$ is the same as the number of
$x$’s in $P(w)$.


Proof. Given $w = w_1 w_2 \cdots w_n \in W$, we define sequences of tableaux $P_0, P_1, \ldots, P_n$ and
$Q_0, Q_1, \ldots, Q_n$ as follows. $P_0$ and $Q_0$ are the empty tableau. If $P_{i-1}$ and $Q_{i-1}$ have been
computed for some $i \leq n$, let $P_i = P_{i-1} \leftarrow w_i$. Suppose this insertion creates a new box
$(c,d)$; then we form $Q_i$ from $Q_{i-1}$ by placing the value $i$ in the box $(c,d)$. By induction on $i$,
we see that every $P_i$ is semistandard with values in $X$, every $Q_i$ is standard, and $P_i$ and $Q_i$
have the same shape. We set $\mathrm{RSK}(w) = (P_n, Q_n)$. The letters in $P_n$ (counting repetitions)
are exactly the letters in $w$, so the last statement of the theorem holds.

Next we describe the inverse algorithm. Given $(P,Q)$ with $P$ semistandard and $Q$ standard
of the same shape, we construct semistandard tableaux $P_n, P_{n-1}, \ldots, P_0$, standard
tableaux $Q_n, Q_{n-1}, \ldots, Q_0$, and letters $w_n, w_{n-1}, \ldots, w_1$ as follows. Initially, $P_n = P$ and
$Q_n = Q$. Suppose, for some $i \leq n$, that we have already constructed tableaux $P_i$ and $Q_i$
such that these tableaux have the same shape and consist of $i$ boxes, $P_i$ is semistandard,
and $Q_i$ is standard. The value $i$ lies in a corner cell of $Q_i$; perform uninsertion starting from
the same cell in $P_i$ to get a smaller semistandard tableau $P_{i-1}$ and a letter $w_i$. Let $Q_{i-1}$ be
$Q_i$ with the $i$ erased. At the end, output the word $w_1 w_2 \cdots w_n$. Using 10.60 and induction,
one checks that $w = w_1 \cdots w_n$ is the unique word $w$ with $\mathrm{RSK}(w) = (P,Q)$. So the RSK
algorithm is a bijection. □


              Insertion Tableau              Recording Tableau

insert 2:     [2]                            [1]
insert 1:     [1] / [2]                      [1] / [2]
insert 1:     [1|1] / [2]                    [1|3] / [2]
insert 3:     [1|1|3] / [2]                  [1|3|4] / [2]
insert 2:     [1|1|2] / [2|3]                [1|3|4] / [2|5]
insert 1:     [1|1|1] / [2|2] / [3]          [1|3|4] / [2|5] / [6]
insert 3:     [1|1|1|3] / [2|2] / [3]        [1|3|4|7] / [2|5] / [6]
insert 1:     [1|1|1|1] / [2|2|3] / [3]      [1|3|4|7] / [2|5|8] / [6]

(The rows of each tableau are listed from top to bottom, separated by "/".)

FIGURE 10.6
Computation of RSK(21132131).


10.114. Example. Let w = 21132131. We compute RSK(w) in Figure 10.6. 


Next we investigate how the RSK algorithm is related to certain statistics on words and 
tableaux. 


10.115. Definition: Descents and Major Index for Standard Tableaux. Let $Q$ be a
standard tableau with $n$ cells. The descent set of $Q$, denoted $\mathrm{Des}(Q)$, is the set of all $k < n$
such that $k+1$ appears in a lower row of $Q$ than $k$. The descent count of $Q$, denoted $\mathrm{des}(Q)$,
is $|\mathrm{Des}(Q)|$. The major index of $Q$, denoted $\mathrm{maj}(Q)$, is $\sum_{k \in \mathrm{Des}(Q)} k$. (Compare to 6.27,
which gives the analogous definitions for words.)


10.116. Example. For the standard tableau Q = Q(w) shown at the bottom of Figure 10.6, 
we have Des(Q) = {1,4,5,7}, des(Q) = 4, and maj(Q) = 17. Here, w = 21132131. Note 
that Des(w) = {1,4,5, 7}, des(w) = 4, and maj(w) = 17. This is not a coincidence. 


10.117. Theorem: RSK Preserves Descents and Major Index. Let $w \in X^n$ be a
word with recording tableau $Q = Q(w)$. Then $\mathrm{Des}(w) = \mathrm{Des}(Q)$, $\mathrm{des}(w) = \mathrm{des}(Q)$, and
$\mathrm{maj}(w) = \mathrm{maj}(Q)$.


Proof. It suffices to prove $\mathrm{Des}(w) = \mathrm{Des}(Q)$. Let $P_0, P_1, \ldots, P_n$ and $Q_0, Q_1, \ldots, Q_n = Q$
be the sequences of tableaux computed when we apply the RSK algorithm to $w$. For each
$k < n$, note that $k \in \mathrm{Des}(w)$ iff $w_k > w_{k+1}$, whereas $k \in \mathrm{Des}(Q)$ iff $k+1$ appears in a row
below $k$’s row in $Q$. So, for each $k < n$, we must prove $w_k > w_{k+1}$ iff $k+1$ appears in a
row below $k$’s row in $Q$. For this, we use the bumping comparison theorem 10.62. Consider
the double insertion $(P_{k-1} \leftarrow w_k) \leftarrow w_{k+1}$. Let the new box in $P_{k-1} \leftarrow w_k$ be $(i,j)$, and
let the new box in $(P_{k-1} \leftarrow w_k) \leftarrow w_{k+1}$ be $(r,s)$. By definition of the recording tableau,
$Q(i,j) = k$ and $Q(r,s) = k+1$. Now, if $w_k > w_{k+1}$, part 2 of 10.62 says that $i < r$ (and
$j \geq s$). So $k+1$ appears in a lower row than $k$ in $Q$. If instead $w_k \leq w_{k+1}$, part 1 of 10.62
says that $i \geq r$ (and $j < s$). So $k+1$ does not appear in a lower row than $k$ in $Q$. □
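
The theorem can be spot-checked by brute force. The Python sketch below is an illustration with hypothetical function names; it assumes the usual row-insertion rule for words (bump the leftmost entry strictly greater than the inserted letter) and compares the descent set of each word with that of its recording tableau.

```python
from bisect import bisect_right
from itertools import product


def rsk(word):
    """Insertion and recording tableaux of a word, as lists of rows."""
    P, Q = [], []
    for step, x in enumerate(word, start=1):
        r = 0
        while r < len(P):
            pos = bisect_right(P[r], x)      # leftmost entry strictly greater than x
            if pos == len(P[r]):
                break                        # x is appended to the end of row r
            P[r][pos], x = x, P[r][pos]      # bump into the next row
            r += 1
        if r == len(P):
            P.append([])
            Q.append([])
        P[r].append(x)
        Q[r].append(step)
    return P, Q


def descents_of_word(word):
    """Set of k (1-based) with w_k > w_{k+1}."""
    return {k for k in range(1, len(word)) if word[k - 1] > word[k]}


def descents_of_tableau(Q):
    """Set of k such that k+1 lies in a strictly lower row of Q than k."""
    row_of = {v: r for r, row in enumerate(Q) for v in row}
    return {k for k in range(1, len(row_of)) if row_of[k + 1] > row_of[k]}


w = [2, 1, 1, 3, 2, 1, 3, 1]
P, Q = rsk(w)
print(P, Q, descents_of_tableau(Q), sum(descents_of_tableau(Q)))

# Exhaustive check over all 4-letter words on the alphabet {1, 2, 3}.
all_agree = all(
    descents_of_word(u) == descents_of_tableau(rsk(u)[1])
    for u in product([1, 2, 3], repeat=4)
)
print(all_agree)
```

For $w = 21132131$ this reproduces the tableaux of Figure 10.6 and the values $\mathrm{Des}(Q) = \{1,4,5,7\}$ and $\mathrm{maj}(Q) = 17$ from Example 10.116.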


Now suppose the letters $x_1, \ldots, x_N$ in $X$ are variables in some polynomial ring. Then
we can view a word $w = w_1 \cdots w_n \in X^n$ as a monomial in the $x_i$'s by forming the product
of all the letters appearing in $w$ (counting repetitions). Using 2.9, we can write

\[ \sum_{w \in X^n} w = (x_1 + \cdots + x_N)^n = p_{(1^n)}(x_1, \ldots, x_N). \tag{10.12} \]


We can use the RSK algorithm and 10.117 to obtain a related identity involving Schur 
polynomials. 


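10.118. Theorem: Schur Expansion for Words Weighted by Major Index. Let $X =
\{x_1, \ldots, x_N\}$ where the $x_i$'s are commuting indeterminates ordered by $x_1 < x_2 < \cdots < x_N$.
For every $n \ge 1$,

\[ \sum_{w \in X^n} t^{\mathrm{maj}(w)}\, w = \sum_{\lambda \in \mathrm{Par}(n)} \Bigl( \sum_{Q \in \mathrm{SYT}(\lambda)} t^{\mathrm{maj}(Q)} \Bigr) s_\lambda(x_1, \ldots, x_N). \tag{10.13} \]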


Proof. The left side of (10.13) is the generating function for the weighted set $X^n$, where
the weight of a word $w$ is $t^{\mathrm{maj}(w)} w$. On the other hand, let us weight each set $\mathrm{SSYT}_X(\lambda)$
by taking wt(P) to be the product of the entries in $P$. Comparing to the definition of Schur
polynomials, we see that the generating function for this weighted set is $s_\lambda(x_1, \ldots, x_N)$.
Next, weight each set $\mathrm{SYT}(\lambda)$ by taking $\mathrm{wt}(Q) = t^{\mathrm{maj}(Q)}$ for $Q \in \mathrm{SYT}(\lambda)$. Finally, consider
the set $\bigcup_{\lambda \in \mathrm{Par}(n)} \mathrm{SSYT}_X(\lambda) \times \mathrm{SYT}(\lambda)$ weighted by setting wt(P,Q) = wt(P) wt(Q). By
the sum and product rules for weighted sets, the generating function for this last weighted
set is precisely the right side of (10.13). To complete the proof, note that the RSK map
is a weight-preserving bijection between $X^n$ and $\bigcup_{\lambda \in \mathrm{Par}(n)} \mathrm{SSYT}_X(\lambda) \times \mathrm{SYT}(\lambda)$, because
of 10.113 and 10.117. □


Setting $t = 1$ in the previous result and using (10.12) gives the following formula for
$p_{(1^n)}$ in terms of Schur polynomials.


10.119. Theorem: Schur Expansion of $p_{(1^n)}$. For all $n, N \ge 1$,

\[ p_{(1^n)}(x_1, \ldots, x_N) = \sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)|\, s_\lambda(x_1, \ldots, x_N). \]
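
Setting every $x_i = 1$ in 10.119 gives the counting identity $N^n = \sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)| \cdot |\mathrm{SSYT}_N(\lambda)|$, which is easy to test numerically. The sketch below is only a sanity check under two assumptions that are not derived in this section: it counts standard tableaux with the hook-length formula and semistandard tableaux with the hook-content formula. All function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def hook_lengths(shape):
    """Hook length of each cell (0-based row r, column c) of the diagram."""
    conj = [sum(1 for part in shape if part > c) for c in range(shape[0])] if shape else []
    return [[shape[r] - c + conj[c] - r - 1 for c in range(shape[r])]
            for r in range(len(shape))]


def count_syt(shape):
    """|SYT(shape)| via the hook-length formula (assumed, not proved here)."""
    product_of_hooks = 1
    for row in hook_lengths(shape):
        for h in row:
            product_of_hooks *= h
    return factorial(sum(shape)) // product_of_hooks


def count_ssyt(shape, N):
    """|SSYT_N(shape)| via the hook-content formula (assumed, not proved here)."""
    value = Fraction(1)
    for r, row in enumerate(hook_lengths(shape)):
        for c, h in enumerate(row):
            value *= Fraction(N + c - r, h)
    return int(value)


def both_sides(n, N):
    """Return (N^n, sum over partitions of n of |SYT| * |SSYT_N|)."""
    rhs = sum(count_syt(lam) * count_ssyt(lam, N) for lam in partitions(n))
    return N ** n, rhs


print(both_sides(3, 2))
```

For $n = 3$ and $N = 2$ the three shapes (3), (2,1), (1,1,1) contribute $1 \cdot 4 + 2 \cdot 2 + 1 \cdot 0 = 8 = 2^3$.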


10.120. Remark. The RSK correspondence can also be used to find the length of the
longest weakly increasing or strictly decreasing subsequence of a given word. For details,
see §12.11.


10.24 Matrices and Tableaux


Performing the RSK map on a word produces a pair consisting of one semistandard tableau 
and one standard tableau. We now define an RSK operation on matrices that will map each 
matrix to a pair of semistandard tableaux of the same shape. The first step is to encode 
the matrix as a certain biword. 


10.121. Definition: Biword of a Matrix. Let $A = (a_{ij})$ be an $M \times N$ matrix with
entries in $\mathbb{N}$. The biword of $A$ is a two-row array

\[ \mathrm{bw}(A) = \begin{pmatrix} i_1 & i_2 & \cdots & i_k \\ j_1 & j_2 & \cdots & j_k \end{pmatrix} \]

constructed as follows. Start with the empty biword, and scan the rows of $A$ from top to
bottom, reading each row from left to right. Whenever a nonzero integer $a_{ij}$ is encountered
in the scan, write down $a_{ij}$ copies of $\binom{i}{j}$ at the end of the current biword. The top row
of bw(A) is called the row word of $A$ and denoted r(A). The bottom row of bw(A) is called
the column word of $A$ and denoted c(A).


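10.122. Example. Suppose $A$ is the matrix

\[ A = \begin{pmatrix} 2 & 0 & 1 & 0 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \]

The biword of $A$ is

\[ \mathrm{bw}(A) = \begin{pmatrix} 1 & 1 & 1 & 2 & 2 & 2 & 2 & 3 \\ 1 & 1 & 3 & 2 & 4 & 4 & 4 & 3 \end{pmatrix}. \]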


10.123. Theorem: Matrices vs. Biwords. Let $X$ be the set of all $M \times N$ matrices
with entries in $\mathbb{N}$. Let $Y$ be the set of all biwords
$w = \begin{pmatrix} i_1 & i_2 & \cdots & i_k \\ j_1 & j_2 & \cdots & j_k \end{pmatrix}$ satisfying the
following conditions: (a) $i_1 \le i_2 \le \cdots \le i_k$; (b) if $i_s = i_{s+1}$, then $j_s \le j_{s+1}$; (c) $i_s \le M$
for all $s$; (d) $j_s \le N$ for all $s$. The map $\mathrm{bw} : X \to Y$ is a bijection. Suppose $A = (a_{ij})$ has
biword bw(A) as above. Then $i$ appears $\sum_j a_{ij}$ times in r(A), $j$ appears $\sum_i a_{ij}$ times in
c(A), and $k = \sum_{i,j} a_{ij}$.

Proof. To show that bw maps $X$ into $Y$, we must show that bw(A) satisfies conditions (a)
through (d). Condition (a) holds since we scan the rows of $A$ from top to bottom. Condition
(b) holds since each row is scanned from left to right. Condition (c) holds since $A$ has $M$
rows. Condition (d) holds since $A$ has $N$ columns. We can invert bw as follows. Given a
biword $w \in Y$, let $A$ be the $M \times N$ matrix such that, for all $i \le M$ and all $j \le N$, $a_{ij}$ is
the number of indices $s$ with $i_s = i$ and $j_s = j$. The last statements in the theorem follow
from the way we constructed r(A) and c(A). □
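
The map bw and its inverse from the proof above are short enough to write out. This is a minimal sketch, assuming a matrix is stored as a list of rows of nonnegative integers and a biword as a pair of lists (row word, column word); the function names are illustrative, and the sample matrix is one whose row and column words agree with those quoted in Example 10.126.

```python
def biword(A):
    """Return (r(A), c(A)) by scanning rows top to bottom, left to right."""
    row_word, col_word = [], []
    for i, row in enumerate(A, start=1):
        for j, a in enumerate(row, start=1):
            row_word.extend([i] * a)       # a copies of the column (i over j)
            col_word.extend([j] * a)
    return row_word, col_word


def matrix_from_biword(row_word, col_word, M, N):
    """Inverse map: a_ij is the number of positions s with i_s = i and j_s = j."""
    A = [[0] * N for _ in range(M)]
    for i, j in zip(row_word, col_word):
        A[i - 1][j - 1] += 1
    return A


A = [[2, 0, 1, 0],
     [0, 1, 0, 3],
     [0, 0, 1, 0]]
r, c = biword(A)
print(r)
print(c)
print(matrix_from_biword(r, c, 3, 4) == A)
```

The row word is weakly increasing and, within each run of equal top entries, the bottom entries are weakly increasing, which is exactly conditions (a) and (b) of 10.123.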


10.124. Theorem: RSK Correspondence for Biwords. Let $Y$ be the set of biwords
defined in 10.123. Let $Z$ be the set of pairs $(P,Q)$ of semistandard tableaux of the same
shape such that $P$ has entries in $\{1,2,\ldots,N\}$ and $Q$ has entries in $\{1,2,\ldots,M\}$. There is
a bijection $\mathrm{RSK} : Y \to Z$ that “preserves content.” This means that if $\binom{v}{w} \in Y$ maps to
$(P,Q) \in Z$, then for all $i \le M$, $v$ and $Q$ contain the same number of $i$'s, and for all $j \le N$,
$w$ and $P$ contain the same number of $j$'s.

Proof. Write $v = i_1 \le i_2 \le \cdots \le i_k$ and $w = j_1, j_2, \ldots, j_k$, where $i_s = i_{s+1}$ implies
$j_s \le j_{s+1}$. As in the previous RSK maps, we build sequences of insertion tableaux $P_0, P_1, \ldots, P_k$
and recording tableaux $Q_0, Q_1, \ldots, Q_k$. Initially, $P_0$ and $Q_0$ are empty. Having constructed
$P_s$ and $Q_s$, let $P_{s+1} = P_s \leftarrow j_{s+1}$. If the new box created by this insertion is $(a,b)$, obtain
$Q_{s+1}$ from $Q_s$ by setting $Q_{s+1}(a,b) = i_{s+1}$. The final output is $\mathrm{RSK}\bigl(\binom{v}{w}\bigr) = (P_k, Q_k)$.

By construction, $P_k$ is semistandard with entries consisting of the letters in $w$, and
the entries of $Q_k$ are the letters in $v$. But is $Q = Q_k$ a semistandard tableau? To see
that it is, note that we obtain $Q$ by successively placing a weakly increasing sequence
of numbers $i_1 \le i_2 \le \cdots \le i_k$ into new corner boxes of an initially empty tableau. It
follows that the rows and columns of $Q$ weakly increase. To see that columns of $Q$ strictly
increase, consider what happens during the placement of a run of equal numbers into $Q$,
say $i = i_s = i_{s+1} = \cdots = i_t$. By definition of $Y$, we have $j_s \le j_{s+1} \le \cdots \le j_t$. When
we insert this weakly increasing sequence into the P-tableau, the resulting sequence of new
boxes forms a horizontal strip by 10.63. So, the corresponding boxes in $Q$ (which consist of
all the boxes labeled $i$ in $Q$) also form a horizontal strip. This means that there are never
two equal numbers in a given column of $Q$.


The inverse algorithm reconstructs the words $v$ and $w$ in reverse, starting with $i_k$ and $j_k$.
Given $(P,Q)$, look for the rightmost occurrence of the largest letter in $Q$, which must reside
in a corner box. Let $i_k$ be this letter. Erase this cell from $Q$, and perform reverse insertion
on $P$ starting at the same cell to recover $j_k$. Iterate this process on the resulting smaller
tableaux. We have $i_k \ge \cdots \ge i_1$ since we remove the largest letter in $Q$ at each stage. When
we remove a string of equal letters from $Q$, say $i = i_t = i_{t-1} = \cdots = i_s$, the associated
letters removed from $P$ must satisfy $j_t \ge j_{t-1} \ge \cdots \ge j_s$. This follows, as above, from the
bumping comparison theorem 10.62. For instance, if $j_{t-1} > j_t$, then the new box created at
stage $t$ would be weakly left of the new box created at stage $t-1$, which contradicts the
requirement of choosing the rightmost $i$ in $Q$ when recovering $i_t$ and $j_t$. It follows that the
inverse algorithm does produce a biword in $Y$, as required. □


Composing the two preceding bijections gives the following result. 


10.125. Theorem: RSK Correspondence for Matrices. For every $M, N \ge 1$, there is
a bijection between the set of $M \times N$ matrices with entries in $\mathbb{N}$ and the set

\[ \bigcup_{\lambda \in \mathrm{Par}} \mathrm{SSYT}_N(\lambda) \times \mathrm{SSYT}_M(\lambda) \]

given by $A \mapsto \mathrm{RSK}(\mathrm{bw}(A))$. If $(a_{ij}) \mapsto (P,Q)$ under this bijection, then the number of $j$'s
in $P$ is $\sum_i a_{ij}$, and the number of $i$'s in $Q$ is $\sum_j a_{ij}$.


10.126. Example. Let us compute the pair of tableaux associated to the matrix 
A from 10.122. Looking at the biword of A, we must insert the sequence c(A) = 
(1,1,3,2,4,4,4,3) into the P-tableau, recording the entries in r(A) = (1,1,1,2,2,2, 2,3) 
in the Q-tableau. This computation appears in Figure 10.7. 
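
The computation described in this example can be replayed step by step. The sketch below is a minimal implementation under the same assumptions as before (tableaux as lists of rows, insertion bumping the leftmost entry strictly greater than the inserted value); rsk_biword is an illustrative name, and the two input lists are the column word and row word quoted in the example.

```python
from bisect import bisect_right
from collections import Counter


def rsk_biword(row_word, col_word):
    """Insert col_word into P, recording the matching letters of row_word in Q."""
    P, Q = [], []
    for i, j in zip(row_word, col_word):
        r = 0
        while True:
            if r == len(P):                    # new box starts a new row
                P.append([j])
                Q.append([i])
                break
            pos = bisect_right(P[r], j)        # leftmost entry > j, if any
            if pos == len(P[r]):               # new box at the end of row r
                P[r].append(j)
                Q[r].append(i)
                break
            P[r][pos], j = j, P[r][pos]        # bump into the next row
            r += 1
    return P, Q


col_word = [1, 1, 3, 2, 4, 4, 4, 3]            # c(A), inserted into P
row_word = [1, 1, 1, 2, 2, 2, 2, 3]            # r(A), recorded in Q
P, Q = rsk_biword(row_word, col_word)
print(P)
print(Q)

# Content is preserved: P has the letters of c(A), Q has the letters of r(A).
content_P = Counter(x for row in P for x in row)
content_Q = Counter(x for row in Q for x in row)
print(content_P == Counter(col_word), content_Q == Counter(row_word))
```

Both output tableaux have shape (6,2), and the recording tableau is semistandard even though its entries repeat, as the proof of 10.124 predicts.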


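10.127. Theorem: Cauchy Identity for Schur Polynomials. For all $M, N \ge 1$, we
have the formal power series identity in $\mathbb{Q}[[x_1, \ldots, x_M, y_1, \ldots, y_N]]$:

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j} = \sum_{\lambda \in \mathrm{Par}} s_\lambda(y_1, \ldots, y_N)\, s_\lambda(x_1, \ldots, x_M). \tag{10.14} \]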


FIGURE 10.7
Applying the RSK map to a biword. (Columns: Insertion Tableau, Recording Tableau. Steps, in order: insert 1, record 1; insert 1, record 1; insert 3, record 1; insert 2, record 2; insert 4, record 2; insert 4, record 2; insert 4, record 2; insert 3, record 3.)


Proof. We interpret each side as the generating function for a suitable set of weighted

objects. For the left side, consider $M \times N$ matrices with entries in $\mathbb{N}$. Let the weight of a
matrix $A = (a_{ij})$ be

\[ \mathrm{wt}(A) = \prod_{i=1}^{M} \prod_{j=1}^{N} (x_i y_j)^{a_{ij}}. \]

We can build such a matrix by choosing the entries $a_{ij} \in \mathbb{N}$ one at a time. For fixed $i$ and
$j$, the generating function for the choice of $a_{ij}$ is

\[ 1 + x_i y_j + (x_i y_j)^2 + \cdots + (x_i y_j)^k + \cdots = \frac{1}{1 - x_i y_j}. \]

By the product rule for weighted sets, we see that the left side of (10.14) is the generating
function for this set of matrices. On the other hand, the RSK bijection converts each matrix
in this set to a pair of semistandard tableaux of the same shape. This bijection $A \mapsto (P,Q)$
will be weight-preserving provided that we weight each occurrence of $j$ in $P$ by $y_j$ and each
occurrence of $i$ in $Q$ by $x_i$. With these weights, the generating function for $\mathrm{SSYT}_N(\lambda)$ is
$s_\lambda(y_1, \ldots, y_N)$, and the generating function for $\mathrm{SSYT}_M(\lambda)$ is $s_\lambda(x_1, \ldots, x_M)$. It now follows
from the sum and product rules for weighted sets that the right side of (10.14) is the
generating function for the weighted set $\bigcup_{\lambda \in \mathrm{Par}} \mathrm{SSYT}_N(\lambda) \times \mathrm{SSYT}_M(\lambda)$. Since RSK is a
weight-preserving bijection, the proof is complete. □
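
As a sanity check on (10.14), the smallest case $M = N = 1$ can be worked by hand. The display below is a sketch relying on two facts about Schur polynomials in a single variable: $s_{(k)}(x_1) = x_1^k$, and $s_\lambda(x_1) = 0$ whenever $\lambda$ has more than one part (a column of length two cannot be strictly increasing with only the value 1 available).

```latex
\[
\frac{1}{1 - x_1 y_1}
  \;=\; \sum_{k \ge 0} x_1^k y_1^k
  \;=\; \sum_{k \ge 0} s_{(k)}(y_1)\, s_{(k)}(x_1)
  \;=\; \sum_{\lambda \in \mathrm{Par}} s_\lambda(y_1)\, s_\lambda(x_1).
\]
```

In terms of the bijection, the $1 \times 1$ matrix $(k)$ has biword consisting of $k$ copies of $\binom{1}{1}$, and RSK sends it to the pair of one-row tableaux of shape $(k)$ filled with 1's.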




10.25 Cauchy Identities 


In the last section, we found a formula expressing the product $\prod_{i,j} (1 - x_i y_j)^{-1}$ as a sum of
products of Schur polynomials. Next, we derive other formulas for this product that involve
other kinds of symmetric polynomials.




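10.128. Theorem: Cauchy Identities. For all $M, N \ge 1$, we have the formal power
series identities:

\begin{align*}
\prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j}
 &= \sum_{\lambda \in \mathrm{Par}_N} h_\lambda(x_1, \ldots, x_M)\, m_\lambda(y_1, \ldots, y_N) \\
 &= \sum_{\lambda \in \mathrm{Par}_M} m_\lambda(x_1, \ldots, x_M)\, h_\lambda(y_1, \ldots, y_N) \\
 &= \sum_{\lambda \in \mathrm{Par}} \frac{p_\lambda(x_1, \ldots, x_M)\, p_\lambda(y_1, \ldots, y_N)}{z_\lambda}.
\end{align*}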


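Proof. Recall from 10.92 the product expansion

\[ \prod_{i=1}^{M} \frac{1}{1 - x_i t} = \sum_{k=0}^{\infty} h_k(x_1, \ldots, x_M)\, t^k. \]

Replacing $t$ by $y_j$, where $j$ is a fixed index, and then taking the product over all $j$, we obtain

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j} = \prod_{j=1}^{N} \sum_{k_j=0}^{\infty} h_{k_j}(x_1, \ldots, x_M)\, y_j^{k_j}. \]

Let us expand the product on the right side using the generalized distributive law (§2.1),
suitably extended to handle infinite series within the product. We obtain

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j} = \sum_{k_1=0}^{\infty} \cdots \sum_{k_N=0}^{\infty} \prod_{j=1}^{N} h_{k_j}(x_1, \ldots, x_M)\, y_j^{k_j}. \]

Let us reorganize the sum on the right side by grouping together summands indexed by
sequences $(k_1, \ldots, k_N)$ that can be sorted to give the same partition $\lambda$. Since $h_{k_1} h_{k_2} \cdots h_{k_N} =
h_\lambda$ for all such sequences, the right side becomes

\[ \sum_{\lambda \in \mathrm{Par}_N} h_\lambda(x_1, \ldots, x_M) \sum_{\substack{(k_1, \ldots, k_N) \in \mathbb{N}^N: \\ \mathrm{sort}(k_1, \ldots, k_N) = \lambda}} y_1^{k_1} y_2^{k_2} \cdots y_N^{k_N}. \]

The inner sum is precisely the definition of $m_\lambda(y_1, \ldots, y_N)$. So the first formula of the
theorem is proved. The second formula follows from the same argument, interchanging the
roles of the $x$'s and $y$'s.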

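To obtain the formula involving power sums, we again start with 10.92, which can be
written

\[ \prod_{k=1}^{MN} \frac{1}{1 - z_k t} = \sum_{n=0}^{\infty} h_n(z_1, \ldots, z_{MN})\, t^n. \]

Replace the $MN$ variables $z_k$ by the $MN$ quantities $x_i y_j$ (with $1 \le i \le M$ and $1 \le j \le N$).
We obtain

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} = \sum_{n=0}^{\infty} h_n(x_1 y_1, x_1 y_2, \ldots, x_M y_N)\, t^n. \]

Now use 10.96 to rewrite the right side in terms of power sums:

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} = \sum_{n=0}^{\infty} t^n \sum_{\lambda \in \mathrm{Par}(n)} p_\lambda(x_1 y_1, x_1 y_2, \ldots, x_M y_N) / z_\lambda. \]

Observe next that, for all $k \ge 1$,

\begin{align*}
p_k(x_1 y_1, \ldots, x_M y_N) &= \sum_{i=1}^{M} \sum_{j=1}^{N} (x_i y_j)^k = \sum_{i=1}^{M} \sum_{j=1}^{N} x_i^k y_j^k \\
 &= \Bigl( \sum_{i=1}^{M} x_i^k \Bigr) \cdot \Bigl( \sum_{j=1}^{N} y_j^k \Bigr) = p_k(x_1, x_2, \ldots, x_M)\, p_k(y_1, y_2, \ldots, y_N).
\end{align*}

It follows from this that, for any partition $\lambda$,

\[ p_\lambda(x_1 y_1, \ldots, x_M y_N) = p_\lambda(x_1, x_2, \ldots, x_M)\, p_\lambda(y_1, y_2, \ldots, y_N). \]

We therefore find that

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} = \sum_{n=0}^{\infty} t^n \sum_{\lambda \in \mathrm{Par}(n)} \frac{p_\lambda(x_1, x_2, \ldots, x_M)\, p_\lambda(y_1, y_2, \ldots, y_N)}{z_\lambda}. \tag{10.15} \]

Setting $t = 1$ gives the final formula of the theorem. □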
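
The coefficient of $t^n$ in (10.15) says that $h_n$ evaluated at the products $x_i y_j$ equals $\sum_{\lambda \in \mathrm{Par}(n)} p_\lambda(x) p_\lambda(y) / z_\lambda$. Since this is a polynomial identity, it can be spot-checked with exact rational arithmetic. The sketch below assumes the usual definition $z_\lambda = \prod_i i^{m_i} m_i!$, where $m_i$ is the number of parts of $\lambda$ equal to $i$; the sample values and function names are arbitrary illustrative choices.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement
from math import factorial, prod


def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def z(lam):
    """z_lambda = product over i of i^{m_i} * m_i! (assumed standard definition)."""
    return prod(i ** m * factorial(m) for i, m in Counter(lam).items())


def h(n, values):
    """Complete symmetric polynomial h_n evaluated at the given values."""
    return sum(prod(c) for c in combinations_with_replacement(values, n))


def p(lam, values):
    """Power-sum symmetric polynomial p_lambda evaluated at the given values."""
    return prod(sum(v ** k for v in values) for k in lam)


xs = [Fraction(1, 2), Fraction(2, 3)]                  # M = 2 sample values
ys = [Fraction(3), Fraction(-1), Fraction(1, 5)]       # N = 3 sample values
products = [x * y for x in xs for y in ys]


def coefficient_pair(n):
    """Return both sides of the t^n coefficient of (10.15) at the sample point."""
    lhs = h(n, products)
    rhs = sum(Fraction(p(lam, xs) * p(lam, ys), z(lam)) for lam in partitions(n))
    return lhs, rhs


print(all(lhs == rhs for lhs, rhs in map(coefficient_pair, range(6))))
```

Agreement at one rational point does not prove the identity, but a mismatch would immediately expose a sign or indexing slip in a hand computation.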


10.26 Dual Bases 


Now we introduce a scalar product on the vector spaces $\Lambda_N^k$. For convenience, we assume
that $N$ (the number of variables) is not less than $k$ (the degree of the polynomials in the
space), so that the various bases of $\Lambda_N^k$ are indexed by all the integer partitions of $k$.


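10.129. Definition: Hall Scalar Product on $\Lambda_N^k$. For $N \ge k$, define the Hall scalar
product on the vector space $\Lambda_N^k$ by setting (for all $\mu, \nu \in \mathrm{Par}(k)$)

\[ \langle p_\mu, p_\nu \rangle = 0 \text{ if } \mu \ne \nu, \qquad \langle p_\mu, p_\mu \rangle = z_\mu, \]

and extending by bilinearity. In more detail, given $f, g \in \Lambda_N^k$, choose scalars $a_\mu, b_\mu \in K$
such that $f = \sum_\mu a_\mu p_\mu$ and $g = \sum_\mu b_\mu p_\mu$. Then $\langle f, g \rangle = \sum_\mu a_\mu b_\mu z_\mu \in K$.

10.130. Definition: Orthonormal Bases and Dual Bases. Suppose $N \ge k$ and $B_1 =
\{f_\mu : \mu \in \mathrm{Par}(k)\}$ and $B_2 = \{g_\mu : \mu \in \mathrm{Par}(k)\}$ are two bases of $\Lambda_N^k$. $B_1$ is called an
orthonormal basis iff $\langle f_\mu, f_\nu \rangle = \chi(\mu = \nu)$ for all $\mu, \nu \in \mathrm{Par}(k)$. $B_1$ and $B_2$ are called dual
bases iff $\langle f_\mu, g_\nu \rangle = \chi(\mu = \nu)$ for all $\mu, \nu \in \mathrm{Par}(k)$.

For example, taking $K = \mathbb{C}$, $\{p_\mu / \sqrt{z_\mu} : \mu \in \mathrm{Par}(k)\}$ is an orthonormal basis of $\Lambda_N^k$.
The next theorem allows us to detect dual bases by looking at expansions of the product
$\prod_{i,j} (1 - x_i y_j)^{-1}$.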

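10.131. Theorem: Characterization of Dual Bases. Suppose $N \ge k$ and $B_1 = \{f_\mu :
\mu \in \mathrm{Par}(k)\}$ and $B_2 = \{g_\mu : \mu \in \mathrm{Par}(k)\}$ are two bases of $\Lambda_N^k$. $B_1$ and $B_2$ are dual bases iff

\[ \left. \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} \right|_{t^k} = \sum_{\mu \in \mathrm{Par}(k)} f_\mu(x_1, \ldots, x_N)\, g_\mu(y_1, \ldots, y_N), \]

where the left side is the coefficient of $t^k$ in the indicated product.

Proof. Comparing the displayed equation to (10.15), we must prove that $B_1$ and $B_2$ are
dual bases iff

\[ \sum_{\mu \in \mathrm{Par}(k)} p_\mu(x_1, \ldots, x_N)\, p_\mu(y_1, \ldots, y_N) / z_\mu = \sum_{\mu \in \mathrm{Par}(k)} f_\mu(x_1, \ldots, x_N)\, g_\mu(y_1, \ldots, y_N). \]

The idea of the proof is to convert each condition into a statement about matrices. Since
$\{p_\mu\}$ and $\{p_\mu / z_\mu\}$ are bases of $\Lambda_N^k$, there exist scalars $a_{\mu,\nu}, b_{\mu,\nu} \in K$ satisfying

\[ f_\nu = \sum_\mu a_{\mu,\nu}\, p_\mu, \qquad g_\nu = \sum_\mu b_{\mu,\nu}\, (p_\mu / z_\mu). \]

Define matrices $A = (a_{\mu,\nu})$ and $B = (b_{\mu,\nu})$. By bilinearity, we compute (for all $\lambda, \nu \in
\mathrm{Par}(k)$):

\begin{align*}
\langle f_\lambda, g_\nu \rangle &= \Bigl\langle \sum_\mu a_{\mu,\lambda}\, p_\mu, \ \sum_\rho b_{\rho,\nu}\, p_\rho / z_\rho \Bigr\rangle \\
 &= \sum_{\mu,\rho} a_{\mu,\lambda}\, b_{\rho,\nu}\, \langle p_\mu, p_\rho / z_\rho \rangle \\
 &= \sum_\mu a_{\mu,\lambda}\, b_{\mu,\nu} = (A^t B)_{\lambda,\nu}.
\end{align*}

It follows that $\{f_\mu\}$ and $\{g_\nu\}$ are dual bases iff $A^t B = I$ (the identity matrix of size
$|\mathrm{Par}(k)|$).

On the other hand, writing $\vec{x} = (x_1, \ldots, x_N)$ and $\vec{y} = (y_1, \ldots, y_N)$, we have

\[ \sum_{\mu \in \mathrm{Par}(k)} f_\mu(\vec{x})\, g_\mu(\vec{y}) = \sum_{\mu, \alpha, \beta} a_{\alpha,\mu}\, b_{\beta,\mu}\, p_\alpha(\vec{x})\, p_\beta(\vec{y}) / z_\beta. \]

Now, one may check that the polynomials

\[ \{ p_\alpha(\vec{x})\, p_\beta(\vec{y}) / z_\beta : (\alpha, \beta) \in \mathrm{Par}(k) \times \mathrm{Par}(k) \} \]

are linearly independent, using the fact that the power-sum polynomials in one set
of variables are linearly independent. It follows that the expression given above for
$\sum_{\mu \in \mathrm{Par}(k)} f_\mu(\vec{x})\, g_\mu(\vec{y})$ will be equal to $\sum_{\alpha \in \mathrm{Par}(k)} p_\alpha(\vec{x})\, p_\alpha(\vec{y}) / z_\alpha$ iff $\sum_\mu a_{\alpha,\mu} b_{\beta,\mu} = 0$ for
all $\alpha \ne \beta$ and $\sum_\mu a_{\alpha,\mu} b_{\alpha,\mu} = 1$ for all $\alpha$. In matrix form, these equations say that $A B^t = I$.
This matrix equation is equivalent to $B^t A = I$ (since all the matrices are square), which is
equivalent in turn to $A^t B = I$. We saw above that this last condition holds iff $B_1$ and $B_2$
are dual bases, so the proof is complete. □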


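10.132. Theorem: Dual Bases of $\Lambda_N^k$. For $N \ge k$, $\{s_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}(k)\}$ is an
orthonormal basis of $\Lambda_N^k$. Also $\{m_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}(k)\}$ and $\{h_\mu(x_1, \ldots, x_N) : \mu \in
\mathrm{Par}(k)\}$ are dual bases of $\Lambda_N^k$.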


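Proof. In 10.127, replace every $x_i$ by $t x_i$. Since $s_\lambda$ is homogeneous of degree $|\lambda|$, we obtain

\[ \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - t x_i y_j} = \sum_{\lambda \in \mathrm{Par}} s_\lambda(y_1, \ldots, y_N)\, s_\lambda(x_1, \ldots, x_N)\, t^{|\lambda|}. \]

Extracting the coefficient of $t^k$ gives

\[ \left. \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - t x_i y_j} \right|_{t^k} = \sum_{\lambda \in \mathrm{Par}(k)} s_\lambda(x_1, \ldots, x_N)\, s_\lambda(y_1, \ldots, y_N). \]

So 10.131 applies to show that $\{s_\lambda : \lambda \in \mathrm{Par}(k)\}$ is an orthonormal basis. We proceed
similarly to see that the $m$'s and $h$'s are dual, starting with 10.128. □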
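
The duality of the $m$'s and $h$'s can be confirmed directly from Definition 10.129 in the smallest interesting case $k = 2$ (with $N \ge 2$). The display below is a worked sketch; it assumes the standard value $z_\lambda = \prod_i i^{m_i} m_i!$, so that $z_{(2)} = z_{(1,1)} = 2$, and uses the elementary expansions of $h_2$ and $m_{(1,1)} = e_2$ in terms of $p_1^2 = p_{(1,1)}$ and $p_2 = p_{(2)}$.

```latex
\begin{align*}
h_{(2)} &= \tfrac12\bigl(p_{(1,1)} + p_{(2)}\bigr), &
h_{(1,1)} &= p_{(1,1)}, &
m_{(2)} &= p_{(2)}, &
m_{(1,1)} &= \tfrac12\bigl(p_{(1,1)} - p_{(2)}\bigr); \\[4pt]
\langle h_{(2)}, m_{(2)} \rangle &= \tfrac12\, z_{(2)} = 1, &
\langle h_{(2)}, m_{(1,1)} \rangle &= \tfrac14\bigl(z_{(1,1)} - z_{(2)}\bigr) = 0, \\
\langle h_{(1,1)}, m_{(2)} \rangle &= 0, &
\langle h_{(1,1)}, m_{(1,1)} \rangle &= \tfrac12\, z_{(1,1)} = 1.
\end{align*}
```

So $\langle h_\mu, m_\nu \rangle = \chi(\mu = \nu)$ for all $\mu, \nu \in \mathrm{Par}(2)$, in agreement with 10.132.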


10.133. Theorem: $\omega$ is an Isometry. For $N \ge k$, the map $\omega : \Lambda_N^k \to \Lambda_N^k$ is an isometry
relative to the Hall scalar product. In other words, for all $f, g \in \Lambda_N^k$, $\langle \omega(f), \omega(g) \rangle = \langle f, g \rangle$.
Therefore, $\omega$ sends an orthonormal basis (resp. dual bases) of $\Lambda_N^k$ to an orthonormal basis
(resp. dual bases) of $\Lambda_N^k$.


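Proof. Write $f = \sum_\mu a_\mu p_\mu$ and $g = \sum_\nu b_\nu p_\nu$ for suitable scalars $a_\mu, b_\nu \in K$. By linearity of
$\omega$ and bilinearity of the Hall scalar product, we compute

\begin{align*}
\langle \omega(f), \omega(g) \rangle &= \Bigl\langle \omega\Bigl( \sum_\mu a_\mu p_\mu \Bigr), \ \omega\Bigl( \sum_\nu b_\nu p_\nu \Bigr) \Bigr\rangle \\
 &= \sum_\mu \sum_\nu a_\mu b_\nu \langle \omega(p_\mu), \omega(p_\nu) \rangle \\
 &= \sum_\mu \sum_\nu a_\mu b_\nu \epsilon_\mu \epsilon_\nu \langle p_\mu, p_\nu \rangle \\
 &= \sum_\mu a_\mu b_\mu \epsilon_\mu^2 z_\mu.
\end{align*}

The last step follows since we only get a nonzero scalar product when $\nu = \mu$. Now, since
$\epsilon_\mu = \pm 1$, the last expression is

\[ \sum_\mu a_\mu b_\mu z_\mu = \langle f, g \rangle. \qquad \Box \]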


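10.134. Theorem: Duality of $e_\mu$'s and $\mathrm{fgt}_\mu$'s. For $N \ge k$, the bases $\{e_\mu : \mu \in \mathrm{Par}(k)\}$
and $\{\mathrm{fgt}_\mu : \mu \in \mathrm{Par}(k)\}$ (forgotten basis) are dual. Moreover,

\[ \left. \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} \right|_{t^k} = \sum_{\lambda \in \mathrm{Par}(k)} e_\lambda(x_1, \ldots, x_N)\, \mathrm{fgt}_\lambda(y_1, \ldots, y_N). \]

Proof. We know that $\{m_\mu\}$ and $\{h_\mu\}$ are dual bases. Since $\mathrm{fgt}_\mu = \omega(m_\mu)$ and $e_\mu = \omega(h_\mu)$,
$\{\mathrm{fgt}_\mu\}$ and $\{e_\mu\}$ are dual bases. The indicated product formula now follows from 10.131. □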


Summary 


Table 10.2 summarizes information about five bases for the vector space $\Lambda_N^k$ of symmetric
polynomials in $N$ variables that are homogeneous of degree $k$. The statements about dual
bases assume $N \ge k$. Recall that $\mathrm{Par}_N(k)$ is the set of integer partitions of $k$ into at
most $N$ parts, while $\mathrm{Par}_N(k)'$ is the set of partitions of $k$ where every part is at most $N$.
Table 10.3 gives formulas and recursions for expressing certain symmetric polynomials as
linear combinations of other symmetric polynomials. Further identities of a similar kind can
be found in the summary of Chapter 11.


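• Skew Shapes and Skew Schur Polynomials. A skew shape $\mu/\nu$ is obtained by removing
the diagram of $\nu$ from the diagram of $\mu$. A semistandard tableau of this shape is a filling
of the cells in $\mu/\nu$ so that rows weakly increase and columns strictly increase. The skew
Schur polynomial in $N$ variables indexed by $\mu/\nu$ is

\[ s_{\mu/\nu}(x_1, \ldots, x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu/\nu)} x^{c(T)}, \]

where the power of $x_i$ is the number of $i$'s in $T$. Skew Schur polynomials are symmetric,
since an involution exists that switches the frequencies of $i$'s and $(i+1)$'s in semistandard
tableaux of this shape.


TABLE 10.2
Bases for symmetric polynomials.

Each entry lists: basis of $\Lambda_N^k$; definition; dual; action of $\omega$.

Monomial: $\{m_\mu : \mu \in \mathrm{Par}_N(k)\}$; $m_\mu = \sum_{\alpha \in \mathbb{N}^N:\ \mathrm{sort}(\alpha) = \mu} x^\alpha$; dual $h_\mu$; $\omega(m_\mu) = \mathrm{fgt}_\mu$.

Elementary: $\{e_\mu : \mu \in \mathrm{Par}_N(k)'\}$; $e_k = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}$, $e_\mu = e_{\mu_1} e_{\mu_2} \cdots e_{\mu_s}$; dual $\mathrm{fgt}_\mu$; $\omega(e_\mu) = h_\mu$.

Complete: $\{h_\mu : \mu \in \mathrm{Par}_N(k)\}$ or $\{h_\mu : \mu \in \mathrm{Par}_N(k)'\}$; $h_k = \sum_{1 \le i_1 \le i_2 \le \cdots \le i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}$, $h_\mu = h_{\mu_1} h_{\mu_2} \cdots h_{\mu_s}$; dual $m_\mu$; $\omega(h_\mu) = e_\mu$.

Power-sum: $\{p_\mu : \mu \in \mathrm{Par}_N(k)'\}$; $p_k = \sum_{i=1}^{N} x_i^k$, $p_\mu = p_{\mu_1} p_{\mu_2} \cdots p_{\mu_s}$; dual $p_\mu / z_\mu$; $\omega(p_\mu) = \epsilon_\mu p_\mu$.

Schur: $\{s_\mu : \mu \in \mathrm{Par}_N(k)\}$; $s_\mu = \sum_{T \in \mathrm{SSYT}_N(\mu)} x^{c(T)}$; dual $s_\mu$; $\omega(s_\mu) = s_{\mu'}$.


TABLE 10.3
Expansions and recursions for symmetric polynomials.

Monomial expansion of Schur polys.: $s_\lambda = \sum_{\mu \in \mathrm{Par}(|\lambda|)} K_{\lambda,\mu}\, m_\mu$.
Schur expansion of complete polys.: $h_\alpha = \sum_{\lambda \in \mathrm{Par}(|\alpha|)} K_{\lambda,\alpha}\, s_\lambda$.
Schur expansion of elementary polys.: $e_\alpha = \sum_{\lambda \in \mathrm{Par}(|\alpha|)} K_{\lambda',\alpha}\, s_\lambda$.
Power-sum expansion of $h_n$: $h_n = \sum_{\mu \in \mathrm{Par}(n)} z_\mu^{-1} p_\mu$.
Power-sum expansion of $e_n$: $e_n = \sum_{\mu \in \mathrm{Par}(n)} \epsilon_\mu z_\mu^{-1} p_\mu$.
Schur expansion of $p_{(1^n)}$: $p_{(1^n)} = \sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)|\, s_\lambda$.
Monomial expansion of skew Schur polys.: $s_{\lambda/\nu} = \sum_{\rho \in \mathrm{Par}(|\lambda/\nu|)} K_{\lambda/\nu,\rho}\, m_\rho$.
Schur expansion of $s_\mu h_\alpha$: $s_\mu h_\alpha = \sum_\lambda K_{\lambda/\mu,\alpha}\, s_\lambda$.
Schur expansion of $s_\mu e_\alpha$: $s_\mu e_\alpha = \sum_\lambda K_{\lambda'/\mu',\alpha}\, s_\lambda$.
Recursion linking $e$'s and $h$'s: $\sum_{i=0}^{m} (-1)^i e_i h_{m-i} = \chi(m = 0)$.
Recursion linking $h$'s and $p$'s: $\sum_{i=0}^{n-1} h_i p_{n-i} = n h_n$.
Recursion linking $e$'s and $p$'s: $\sum_{i=0}^{n-1} (-1)^i e_i p_{n-i} = (-1)^{n-1} n e_n$.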


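• Orderings on Partitions. For $\mu, \nu \in \mathrm{Par}(k)$, $\mu \le_{\mathrm{lex}} \nu$ means that $\mu = \nu$ or the first
nonzero entry of $\nu - \mu$ is positive; $\le_{\mathrm{lex}}$ is a total ordering on $\mathrm{Par}(k)$. We say $\mu \trianglelefteq \nu$ ($\nu$
dominates $\mu$) iff $\mu_1 + \cdots + \mu_i \le \nu_1 + \cdots + \nu_i$ for all $i \ge 1$; $\trianglelefteq$ is a partial ordering on
$\mathrm{Par}(k)$. We have $\mu \trianglelefteq \nu$ iff $\nu' \trianglelefteq \mu'$ iff $\mu$ can be transformed into $\nu$ by a sequence of raising
operators (moving one box to a higher row). Also, $\mu \trianglelefteq \nu$ implies $\mu \le_{\mathrm{lex}} \nu$.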


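• Kostka Numbers. For $\nu \subseteq \mu \in \mathrm{Par}$ and $\alpha \in \mathbb{N}^N$, the Kostka number $K_{\mu/\nu,\alpha}$ is the
number of semistandard tableaux of shape $\mu/\nu$ and content $\alpha$. We have $K_{\lambda,\lambda} = 1$ for
all $\lambda \in \mathrm{Par}$, and $K_{\lambda,\mu} \ne 0$ implies $\mu \trianglelefteq \lambda$ and $\mu \le_{\mathrm{lex}} \lambda$.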


• Tableau Insertion. Given a semistandard tableau $T$ and value $x$, we obtain a new
semistandard tableau $T \leftarrow x$ as follows. The element $x$ bumps the leftmost value $y > x$ in
the top row into the second row, and this bumping continues recursively until a value
is placed in a new box at the end of some row. The bumping path moves weakly left
as it goes down. Insertion is invertible if we know which corner box is the new one. If
we insert a weakly increasing sequence into $T$, the new boxes move strictly right and
weakly higher, producing a horizontal strip. If we insert a strictly decreasing sequence
into $T$, the new boxes move weakly left and strictly lower, producing a vertical strip.


• Pieri Rules. (a) $s_\mu h_k = \sum_\nu s_\nu$, where we sum over all $\nu$ such that $\nu/\mu$ is a horizontal
strip of size $k$. (b) $s_\mu e_k = \sum_\nu s_\nu$, where we sum over all $\nu$ such that $\nu/\mu$ is a vertical
strip of size $k$. If there are $N$ variables, we only use shapes $\nu$ with at most $N$ parts.


• Algebraic Independence. A list of polynomials $(f_1, \ldots, f_k)$ is algebraically independent
over $K$ iff the only polynomial $g \in K[z_1, \ldots, z_k]$ with $g(f_1, \ldots, f_k) = 0$ is the zero
polynomial. Equivalently, the monomials $\{f_1^{i_1} \cdots f_k^{i_k} : i_j \in \mathbb{N}\}$ are linearly independent
over $K$. Polynomials $f_1, \ldots, f_n \in K[x_1, \ldots, x_n]$ are algebraically independent over $K$ if
(and only if) $\det \| \partial f_i / \partial x_j \|_{1 \le i,j \le n} \ne 0$.


• Algebraically Independent Symmetric Polynomials. In the ring $K[x_1, \ldots, x_N]$, the lists
$(p_1, \ldots, p_N)$, $(h_1, \ldots, h_N)$, and $(e_1, \ldots, e_N)$ are algebraically independent over $K$. So
we can view the ring and $K$-vector space $\Lambda_N$ as isomorphic to the polynomial ring
$K[z_1, \ldots, z_N]$ in three ways (an isomorphism can send $z_i \in K[z_1, \ldots, z_N]$ to either $p_i$ or
$h_i$ or $e_i$).


• Generating Functions for $e$'s and $h$'s. We have

\[ E_N(t) = \prod_{i=1}^{N} (1 + x_i t) = \sum_{k=0}^{N} e_k(x_1, \ldots, x_N)\, t^k; \]
\[ H_N(t) = \prod_{i=1}^{N} (1 - x_i t)^{-1} = \sum_{k=0}^{\infty} h_k(x_1, \ldots, x_N)\, t^k. \]


• Dual Bases and Cauchy Identities. Assume $N \ge k$. The Hall scalar product on $\Lambda_N^k$
is defined by setting $\langle p_\mu, p_\nu \rangle = z_\mu \chi(\mu = \nu)$ and extending by bilinearity. Two bases
$\{f_\mu : \mu \in \mathrm{Par}(k)\}$ and $\{g_\mu : \mu \in \mathrm{Par}(k)\}$ of $\Lambda_N^k$ are dual relative to this inner product
iff they satisfy the Cauchy identity

\[ \text{coefficient of } t^k \text{ in } \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} = \sum_{\mu \in \mathrm{Par}(k)} f_\mu(x_1, \ldots, x_N)\, g_\mu(y_1, \ldots, y_N). \]


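• The Map $\omega$. The map $\omega : \Lambda_N \to \Lambda_N$ is defined by sending $c$ to $c$ (for $c \in K$) and $p_i$
to $(-1)^{i-1} p_i$, and extending by the universal mapping property of polynomial rings.
Note $\omega(p_\mu) = \epsilon_\mu p_\mu$ where $\epsilon_\mu = (-1)^{|\mu| - \ell(\mu)}$. The map $\omega$ is an involution ($\omega^2 = \mathrm{id}$),
an isomorphism of rings and vector spaces, and (for all $k \le N$) an isometry of $\Lambda_N^k$ (so
$\langle \omega(f), \omega(g) \rangle = \langle f, g \rangle$ for $f, g \in \Lambda_N^k$).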


• RSK Correspondences. There are bijections between: (a) permutations in $S_n$ and pairs
$(P,Q)$ of standard tableaux of the same shape $\lambda \in \mathrm{Par}(n)$; (b) words in $X^n$ and pairs
$(P,Q)$ where $P \in \mathrm{SSYT}_X(\lambda)$ and $Q \in \mathrm{SYT}(\lambda)$ for some $\lambda \in \mathrm{Par}(n)$; (c) $M \times N$ matrices
with values in $\mathbb{N}$ and pairs $(P,Q)$ where $P \in \mathrm{SSYT}_N(\lambda)$ and $Q \in \mathrm{SSYT}_M(\lambda)$. In each
case, one inserts successive entries into $P$, using $Q$ to record the locations of new boxes.
For (c), one must first encode the matrix as a biword. If $w \in S_n$ maps to $(P,Q)$, then $w^{-1}$
maps to $(Q,P)$. If $w \in X^n$ maps to $(P,Q)$, then Des(w) = Des(Q), des(w) = des(Q),
and maj(w) = maj(Q), where Des(Q) is the set of $k < n$ such that $k+1$ is in a lower
row of $Q$ than $k$, des(Q) = |Des(Q)|, and $\mathrm{maj}(Q) = \sum_{k \in \mathrm{Des}(Q)} k$.
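
Several items of this summary interact: Kostka numbers can be computed by peeling off horizontal strips (the cells holding the largest value of a semistandard tableau always form one), and the result can be compared against the dominance order. The following is a minimal sketch of that idea for straight shapes only; the recursion and all function names are illustrative choices rather than an algorithm stated in the text.

```python
from functools import lru_cache
from itertools import product


def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dominated_by(mu, nu):
    """True iff mu is dominated by nu (every partial sum of mu is <= that of nu)."""
    length = max(len(mu), len(nu))
    a = list(mu) + [0] * (length - len(mu))
    b = list(nu) + [0] * (length - len(nu))
    sum_a = sum_b = 0
    for x, y in zip(a, b):
        sum_a += x
        sum_b += y
        if sum_a > sum_b:
            return False
    return True


@lru_cache(maxsize=None)
def kostka(shape, content):
    """Number of semistandard tableaux of the given straight shape and content."""
    shape = tuple(part for part in shape if part > 0)
    if sum(shape) != sum(content):
        return 0
    if not content:
        return 1                               # empty shape, empty content
    *rest, last = content
    # Remove a horizontal strip of size `last`: lambda_{i+1} <= mu_i <= lambda_i.
    padded = shape + (0,)
    ranges = [range(padded[i + 1], padded[i] + 1) for i in range(len(shape))]
    total = 0
    for mu in product(*ranges):
        if sum(shape) - sum(mu) == last:
            total += kostka(tuple(mu), tuple(rest))
    return total


print(kostka((2, 1), (1, 1, 1)), kostka((3, 2), (2, 2, 1)), kostka((3, 2), (3, 2)))

# Check the summary's claim: a nonzero Kostka number forces mu to be dominated by lambda.
implication_holds = all(
    kostka(lam, mu) == 0 or dominated_by(mu, lam)
    for k in range(1, 7)
    for lam in partitions(k)
    for mu in partitions(k)
)
print(implication_holds)
```

The memoized recursion is adequate for small shapes such as those in the exercises; it is not tuned for large partitions.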


Exercises 


10.135. Draw all skew shapes $\mu/\nu$ where $\mu \vdash 6$ and $\nu \vdash 3$. Indicate which skew shapes are
horizontal (resp. vertical) strips.


10.136. Given a skew shape $S \subseteq \mathbb{N} \times \mathbb{N}$, describe how to calculate the number of different
pairs of partitions $(\mu, \nu)$ such that $S = \mu/\nu$.


10.137. Find necessary and sufficient algebraic conditions on the parts of $\mu$ and $\nu$ to ensure
that the skew shape $\mu/\nu$ is (a) a horizontal strip; (b) a vertical strip.


10.138. How many horizontal strips are contained in $\{1,2,\ldots,a\} \times \{1,2,\ldots,b\}$?


10.139. If $|\mu/\nu| = n$ and $|X| = k$, how many tableaux with values in $X$ have shape $\mu/\nu$?


10.140. List all the tableaux in: (a) $\mathrm{SSYT}_5((3,2))$; (b) $\mathrm{SSYT}_2((3,2))$; (c) $\mathrm{SYT}((3,2,1))$.


10.141. Give a direct counting argument to determine $|\mathrm{SYT}(\mu)|$ when $\mu$ is a hook.


10.142. Prove that $|\mathrm{SYT}(\mu/\nu)| = |\mathrm{SYT}(\mu'/\nu')|$ for all skew shapes $\mu/\nu$.


10.143. Compute $s_{(2,2)}(x_1, \ldots, x_N)$ for $N = 3, 4, 5$ by enumerating tableaux.


10.144. Compute $s_{(2,2)/(1)}(x_1, \ldots, x_N)$ for $N = 2, 3, 4$ by enumerating tableaux.


10.145. Find the coefficients of the following monomials in $s_{(3,2,1)}(x_1, \ldots, x_6)$ by enumerating
tableaux: (a) $x_1 x_2 x_3 x_4 x_5 x_6$; (b) a}x3x3; (c) x3x3; (d) xi rergraxs; (e) vizexzr42%5;
(f) v1 r2r3%472.


10.146. Let $N \ge 4$. Enumerate tableaux to confirm that the coefficients of 2 toe e
© 050304, and x1x5a304 in $s_{(4,3)/(1)}(x_1, \ldots, x_N)$ are all equal to 6, as claimed in 10.17.
What happens to these coefficients if $N < 4$?


10.147. For which values of $N$ is $s_{\mu/\nu}(x_1, \ldots, x_N) = 0$?


10.148. Compute: (a) pa(a1, 22, £3); (b) e3(a1,.--, 5); (c) ha(w1, 22, U3); 
(d) ™(3,2,2) (Wis24 45a) 


10.149. For $\mu = (2,1)$, compute: (a) $p_\mu(x_1, x_2, x_3)$; (b) $e_\mu(x_1, x_2, x_3)$; (c) $h_\mu(x_1, x_2, x_3)$.


10.150. How many monomials appear with nonzero coefficient in: (a) $e_k(x_1, \ldots, x_N)$; (b)
$h_k(x_1, \ldots, x_N)$?


10.151. How many monomials appear with nonzero coefficient in $m_\mu(x_1, \ldots, x_N)$?


10.152. Give a direct proof that the polynomials $e_k(x_1, \ldots, x_N)$ and $h_k(x_1, \ldots, x_N)$ (as
defined in 10.21 and 10.22) are symmetric.


10.153. Find a skew shape p/v such that eg3hahzeshi = Syjv- 


10.154. Prove that any finite product of skew Schur polynomials is a skew Schur polyno- 
mial. 


10.155. Check that the set of homogeneous polynomials of degree $k$ in $R = K[x_1, \ldots, x_N]$
is a vector subspace of $R$. Conclude that $\Lambda_N^k$ is a subspace of $\Lambda_N$ for each $k \ge 0$.


10.156. Write down an explicit basis for the K-vector space Ag. 


10.157. Compute $\dim(\Lambda_N^k)$ for each choice of $k$ and $N$ in the range $1 \le k \le 6$ and
$1 \le N \le 6$.


10.158. Suppose $\{f_i : i \in I\}$ is a collection of nonzero polynomials in $R = K[x_1, \ldots, x_N]$
such that, whenever some $x^\alpha$ appears in some $f_i$ with nonzero coefficient, the coefficient of
$x^\alpha$ in every other $f_j$ is zero. Prove that $\{f_i : i \in I\}$ is linearly independent over $K$.


10.159. Compute the following Kostka numbers:
(a) $K_{(3,3,2),(2,1,2,1,1,1)}$; (b) $K_{(3,2,2,1),(2,2,1,1,1,1)}$; (c) $K_{(5,5),(1^{10})}$; (d) $K_{(3,3,3)/(2,1),(2,2,1,1)}$.


10.160. Compute the image of the first tableau in the proof of 10.33 under the maps $f_i$,
for $i = 1, 2, 4, 5, 6, 7, 8$.


10.161. Use the maps $f_i$ in the proof of 10.33 to compute specific bijections between the
three collections of six tableaux in 10.17. (This calculation was begun in 10.34.)


10.162. Express the Schur polynomials $s_\mu(x_1, x_2, x_3, x_4, x_5)$ as explicit linear combinations
of monomial symmetric polynomials, for all $\mu \vdash 4$ and $\mu \vdash 5$.


10.163. Express the skew Schur polynomial s(3,3,2)/(1)(11,---,@s) as a linear combination 
of monomial symmetric polynomials. 


10.164. For all partitions $\mu \vdash 3$, express $h_\mu$ and $e_\mu$ in terms of monomial symmetric
polynomials by viewing $h_\mu$ and $e_\mu$ as instances of skew Schur polynomials.


10.165. (a) Find a recursion characterizing the Kostka numbers $K_{\mu/\nu,\alpha}$. (b) Use (a) to
write a computer program for computing Kostka numbers.


10.166. Check that $\le_{\mathrm{lex}}$ is a total ordering of the set $\mathrm{Par}(k)$, for each $k \ge 0$.


10.167. Prove that $\trianglelefteq$ is a total ordering of $\mathrm{Par}(k)$ iff $k \le 5$.


10.168. (a) List the integer partitions of 7 in lexicographic order. (b) Find all pairs $\mu, \nu \vdash 7$
such that $\mu \le_{\mathrm{lex}} \nu$ but $\mu \not\trianglelefteq \nu$.


10.169. (a) Find an ordered sequence of raising operators that changes $\mu = (5,4,2,1,1)$ to
$\nu = (7,3,2,1)$. (b) How many such sequences are there?


10.170. Prove or disprove: for all partitions $\mu, \nu \vdash k$, $\mu \le_{\mathrm{lex}} \nu$ iff $\nu' \le_{\mathrm{lex}} \mu'$.


10.171. Let $\mu, \nu \vdash k$. Can you prove that $\mu \trianglelefteq \nu$ implies $\nu' \trianglelefteq \mu'$ by arguing directly from the
definitions, without using raising operators?


10.172. Define an ordering $\le_{\mathrm{lex}}$ on the set $\mathbb{N}^N$ as in 10.36. Show that $\le_{\mathrm{lex}}$ is a total
ordering of $\mathbb{N}^N$ satisfying the following well-ordering property: there exists no infinite strictly
decreasing sequence

\[ \alpha^{(1)} >_{\mathrm{lex}} \alpha^{(2)} >_{\mathrm{lex}} \cdots >_{\mathrm{lex}} \alpha^{(k)} >_{\mathrm{lex}} \cdots \qquad (\alpha^{(k)} \in \mathbb{N}^N). \]


10.173. Define the lex degree of a nonzero polynomial $f(x_1, \ldots, x_N) \in R = K[x_1, \ldots, x_N]$,
denoted deg(f), to be the largest $\alpha \in \mathbb{N}^N$ (relative to the lexicographic ordering defined
in 10.172) such that $x^\alpha$ occurs with nonzero coefficient in $f$. Prove that deg(gh) = deg(g) +
deg(h) for all nonzero $g, h \in R$, and $\deg(g+h) \le_{\mathrm{lex}} \max(\deg(g), \deg(h))$ whenever both sides
are defined.


10.174. (a) Find the Kostka matrix indexed by all partitions of 4. (b) Invert this matrix,
and thereby express the monomial symmetric polynomials $m_\mu(x_1, x_2, x_3, x_4)$ (for $\mu \vdash 4$) as
linear combinations of Schur polynomials.


10.175. Find the Kostka matrix indexed by partitions in Pars(7), and invert it. 


10.176. Let K be the Kostka matrix indexed by all partitions of 8. How many nonzero 
entries does this matrix have? 


10.177. Suppose $A$ is an $n \times n$ matrix with integer entries such that $\det(A) = \pm 1$. Prove
that $A^{-1}$ has all integer entries. (In particular, this applies when $A$ is a Kostka matrix.)


10.178. Suppose $\{v_i : i \in I\}$ is a basis for a finite-dimensional $K$-vector space $V$, $\{w_i : i \in
I\}$ is an indexed family of vectors in $V$, and for some total ordering $\le$ of $I$ and some scalars
$a_{ij} \in K$ with $a_{ii} \ne 0$, we have $w_i = \sum_{j \le i} a_{ij} v_j$ for all $i \in I$. Prove that $\{w_i : i \in I\}$ is a
basis of $V$.


10.179. Let $T$ be the tableau in 10.53. Confirm that $T \leftarrow 1$ and $T \leftarrow 0$ are as stated in
that example. Also, compute $T \leftarrow i$ for $i = 2, 4, 5, 7$, and verify that 10.54 holds.


10.180. Let $T$ be the semistandard tableau


Compute $T \leftarrow i$ for $1 \le i \le 9$.


10.181. Suppose we apply the tableau insertion algorithm 10.52 to a tableau $T$ of skew
shape. Are 10.54 and 10.55 still true?


10.182. Give a non-recursive description of $T \leftarrow x$ in the case where: (a) $x$ is larger than
every entry of $T$; (b) $x$ is smaller than every entry of $T$.


10.183. Let $T$ be the tableau in 10.53. Perform reverse insertion starting at each corner
box of $T$ to obtain smaller tableaux $T_i$ and values $x_i$. Verify that $T_i \leftarrow x_i = T$ for each
answer.


10.184. Let T be the tableau in 10.180. Perform reverse insertion starting at each corner 
box of T, and verify that properties 10.58(a),(b) hold in each case. 


10.185. Prove 10.58(c). 
10.186. Prove 10.59. 
10.187. Express $s_{(4,4,3,1,1)} h_1$ as a sum of Schur polynomials.


10.188. Let T be the tableau in 10.53. Successively insert 1,2,2,3,5,5 into T, and verify 
that the assertions of the bumping comparison theorem hold. 


10.189. Let $T$ be the tableau in 10.53. Successively insert 7, 5, 3, 2, 1 into $T$, and verify that
the assertions of the bumping comparison theorem hold.


10.190. Let $T$ be the tableau in 10.180. Successively insert 1, 1, 3, 3, 3, 4 into $T$, and verify
that the assertions of the bumping comparison theorem hold.


10.191. Let $T$ be the tableau in 10.180. Successively insert 7, 6, 5, 3, 2, 1 into $T$, and verify
that the assertions of the bumping comparison theorem hold.


10.192. Let $T$ be the tableau in 10.61 of shape $\mu = (5,4,4,4,1)$. For each shape $\nu$ such
that $\nu/\mu$ is a horizontal strip of size 3, find a weakly increasing sequence $x_1 \le x_2 \le x_3$ such
that $(((T \leftarrow x_1) \leftarrow x_2) \leftarrow x_3)$ has shape $\nu$, or prove that no such sequence exists.


10.193. Repeat the previous exercise, replacing horizontal strips by vertical strips and 
weakly increasing sequences by strictly decreasing sequences. 


10.194. Prove 10.64(b). 


10.195. Let T be the tableau in 10.180. Find the unique semistandard tableau S of shape 
(7,5,4,4,1,1) and 21 < 22 < z3 < z4 such that T = S — 2202324. 


10.196. Let T be the tableau in 10.180. Find the unique semistandard tableau S of shape 
(6,6,5,3,2) and 21 > 22 > 23 > z4 such that T = S — 21222324. 


10.197. Expand each symmetric polynomial into sums of Schur polynomials: (a) (4,3,1)e2; 
(b) 8(2,2)h33 (€) 8(2,2,1,1,1)h43 (d) $(3,3,2)€3- 

10.198. Use the Pieri rule to find the Schur expansions of h(3.2.1), 2(3,1,2), 2(2,2,3), and 
h1,3,2), and verify that the answers agree with those found in 10.67. 


10.199. Expand each symmetric polynomial into sums of Schur polynomials: (a) h(2,2,2); 
(b) hs,3)3 (c) 83,2) 2(2,1)3 (d) §(6,3,2,2)/(3,2)- 


10.200. Find the coefficients of the following Schur polynomials in the Schur expansion of 
h3,2,2,1,1)! (a) $(9)3 (b) 8(5,4)3 (€) 8(4,4,1); (4) $(2,2,2,2,1)3 (€) 8(3,3,3)3 (£) $(3,2,2,1,1)- 


10.201. Use 10.73 to compute the monomial expansions of $h_\mu(x_1, x_2, x_3, x_4)$ for all
partitions $\mu$ of size at most four.

10.202. Let $\alpha = (\alpha_1,\ldots,\alpha_s)$. Prove that the coefficient of $m_\lambda(x_1,\ldots,x_N)$ in the monomial
expansion of $h_\alpha(x_1,\ldots,x_N)$ is the number of $s \times N$ matrices $A$ with entries in $\mathbb{N}$ such that
$\sum_{j=1}^{N} A(i,j) = \alpha_i$ for $1 \le i \le s$ and $\sum_{i=1}^{s} A(i,j) = \lambda_j$ for $1 \le j \le N$.


10.203. Use the Pieri rules to compute the Schur expansions of: (a) $e_{(3,3,1)}$; (b) $e_{(5,3)}$; (c)
$s_{(3,2)} e_{(2,1)}$; (d) $s_{(4,3,3,3,1,1)/(3,1,1,1)}$.


10.204. Find the coefficients of the following Schur polynomials in the Schur expansion of
$e_{(4,3,2,1)}$: (a) $s_{(4,3,2,1)}$; (b) $s_{(5,5)}$; (c) $s_{(2,2,2,2,2)}$; (d) $s_{(2,2,2,1^4)}$; (e) $s_{(1^{10})}$.


10.205. Use 10.78 to express $e_{(2,2,1)}$ and $e_{(3,2)}$ as linear combinations of monomial
symmetric polynomials.


10.206. Prove the formula for $s_\mu e_\alpha$ stated in Table 10.3.


10.207. Let $\alpha = (\alpha_1,\ldots,\alpha_s)$. Find a combinatorial interpretation for the coefficient of
$m_\lambda(x_1,\ldots,x_N)$ in the monomial expansion of $e_\alpha(x_1,\ldots,x_N)$ in terms of certain $s \times N$
matrices (cf. 10.202).


10.208. Prove that the following lists of polynomials are algebraically dependent by
exhibiting an explicit dependence relation: (a) $h_i(x_1, x_2)$ for $1 \le i \le 3$; (b) $e_i(x_1, x_2, x_3)$ for
$1 \le i \le 4$; (c) $p_i(x_1, x_2, x_3)$ for $1 \le i \le 4$.


10.209. Prove that any sublist of an algebraically independent list is algebraically inde- 
pendent. 


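10.210. Suppose $\alpha = (\alpha_1,\ldots,\alpha_N) \in \mathbb{N}^N$ is a partition. Show that
$\deg(e_1^{\alpha_1-\alpha_2} e_2^{\alpha_2-\alpha_3} \cdots e_{N-1}^{\alpha_{N-1}-\alpha_N} e_N^{\alpha_N}) = \alpha$ (see 10.173 for the definition of lex degree).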


10.211. Algorithmic Proof of the Fundamental Theorem of Symmetric Polynomials.
Prove that the following algorithm will express any $f \in \Lambda_N$ as a polynomial in the
elementary symmetric polynomials $e_i(x_1,\ldots,x_N)$ (where $1 \le i \le N$) in finitely many steps.
If $f = 0$, use the zero polynomial. Otherwise, let the term of largest degree in $f$ (see 10.173)
be $c x^\alpha$ where $c \in K$ is nonzero. Use symmetry of $f$ to show that $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_N$, and
that $f - c\, e_1^{\alpha_1-\alpha_2} e_2^{\alpha_2-\alpha_3} \cdots e_{N-1}^{\alpha_{N-1}-\alpha_N} e_N^{\alpha_N}$ is either 0 or has degree $\beta <_{\mathrm{lex}} \alpha$ (see 10.210).
Continue similarly to express this new polynomial (and hence $f$) as a polynomial in the
$e_i$'s.
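
The algorithm of 10.211 can be tried out on small examples. The following Python sketch is only an illustration: it stores a polynomial as a dictionary from exponent tuples to integer coefficients, it assumes that the lex degree of 10.173 is the lexicographically largest exponent vector, and the helper names (elementary, multiply, express_in_elementary) are hypothetical choices made here, not notation from the text. It also assumes that the input really is symmetric.

```python
from itertools import combinations


def elementary(i, n):
    """e_i(x_1, ..., x_n) as a dict {exponent tuple: coefficient}."""
    return {
        tuple(1 if j in subset else 0 for j in range(n)): 1
        for subset in combinations(range(n), i)
    }


def multiply(f, g):
    """Product of two polynomials stored as exponent-tuple dictionaries."""
    product = {}
    for a, ca in f.items():
        for b, cb in g.items():
            key = tuple(x + y for x, y in zip(a, b))
            product[key] = product.get(key, 0) + ca * cb
    return {key: c for key, c in product.items() if c != 0}


def express_in_elementary(f, n):
    """Write a symmetric f as a polynomial in e_1, ..., e_n.

    Returns {(b_1, ..., b_n): c}, meaning the sum of c * e_1^b_1 ... e_n^b_n.
    """
    f = {key: c for key, c in f.items() if c != 0}
    answer = {}
    while f:
        alpha = max(f)  # Python compares tuples lexicographically
        c = f[alpha]
        powers = tuple(alpha[i] - alpha[i + 1] for i in range(n - 1)) + (alpha[n - 1],)
        if min(powers) < 0:
            raise ValueError("input is not a symmetric polynomial")
        answer[powers] = answer.get(powers, 0) + c
        # Build c * e_1^(a1-a2) * ... * e_n^(an), whose lex degree is alpha.
        term = {(0,) * n: c}
        for i, b in enumerate(powers, start=1):
            for _ in range(b):
                term = multiply(term, elementary(i, n))
        # Subtract the term, so that the leading monomial cancels.
        for key, coeff in term.items():
            f[key] = f.get(key, 0) - coeff
            if f[key] == 0:
                del f[key]
    return answer


# p_3(x_1, x_2, x_3) = x_1^3 + x_2^3 + x_3^3
p3 = {(3, 0, 0): 1, (0, 3, 0): 1, (0, 0, 3): 1}
print(express_in_elementary(p3, 3))
```

Each key of the returned dictionary lists the exponents of $e_1, \ldots, e_N$ in one product, so the answer for $p_3$ can be compared with the familiar expansion $e_1^3 - 3e_1e_2 + 3e_3$.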


10.212. Use the algorithm in the preceding exercise to express $m_{(2,1)}(x_1, x_2, x_3, x_4)$ and
$p_3(x_1, x_2, x_3, x_4)$ as polynomials in $\{e_i(x_1, x_2, x_3, x_4) : 1 \le i \le 4\}$.


10.213. Use the test in 10.83 to verify that the polynomials $h_i(x_1, x_2, x_3)$ for $1 \le i \le 3$ are
algebraically independent. Can you generalize this computation to more than 3 variables? 


10.214. Use the test in 10.83 to verify that the polynomials $e_i(x_1, x_2, x_3, x_4)$ for
$1 \le i \le 4$ are algebraically independent. Can you generalize this computation to more than 4
variables?


10.215. Compute the images of the two objects shown below under the involution I in the
proof of 10.87.

[Figure: the two objects whose images are to be computed]

10.216. Write out all the matched pairs (z,I(z)) in the proof of 10.87 when: (a) N = 2
and m = 3; (b) N =3 and m= 2. 


10.217. Imitate the argument in the proof of 10.88 to show that algebraic independence of
$(h_1,\ldots,h_N)$ in $K[x_1,\ldots,x_N]$ implies algebraic independence of $(e_1,\ldots,e_N)$.

10.218. (a) Prove the recursion $e_k(x_1,\ldots,x_N) = e_k(x_1,\ldots,x_{N-1}) + e_{k-1}(x_1,\ldots,x_{N-1})\,x_N$
for $k, N \ge 1$. What are the initial conditions? (b) Find a similar recursion for $h_k(x_1,\ldots,x_N)$.


10.219. (a) Prove $s'(n,k) = e_{n-k}(1,2,\ldots,n-1)$. (b) Prove $S(n,k) = h_{n-k}(1,2,\ldots,k)$.
10.220. Prove 10.91 by expanding $\prod_{i=1}^{N} (x - r_i)$ using the generalized distributive law.


10.221. Consider the polynomial $p = x^5 - 2x^4 + 5x^3 + 7x^2 - x - 4$, which has five
roots $r_1,\ldots,r_5 \in \mathbb{C}$. Compute: (a) the sum of the roots; (b) the product of the roots;
(c) eg(r1,---,175); (d) the sum of the squares of the roots; (e) 0 j4, Tee. 


10.222. Use 10.92 to calculate the coefficient of ¢* in the multiplicative inverse of (1 — 
2x)(1 - 3x)(1 - 5x).


10.223. Let A be an $n \times n$ complex matrix. What is the relationship between the coefficients
of the characteristic polynomial $\det(tI - A)$ and the eigenvalues $r_1,\ldots,r_n$ of A?


10.224. Use (10.8) to show that $p_i(x_1,\ldots,x_N)$ (for $1 \le i \le N$) are algebraically
independent over K iff $h_i(x_1,\ldots,x_N)$ (for $1 \le i \le N$) are algebraically independent over K.


10.225. Use (10.9) to show that $p_i(x_1,\ldots,x_N)$ (for $1 \le i \le N$) are algebraically
independent iff $e_i(x_1,\ldots,x_N)$ (for $1 \le i \le N$) are algebraically independent.


10.226. Consider the maps f and g from the proof of 10.93. Compute 


f((5, 2[4]4]5]15}, (4]4))) and o( [12 [2] 476 [6)). 


10.227. Consider the maps I and g from the proof of 10.94. Compute


I ( 13 |, ’ I 3, 13 |, ’ I 3, 13], ’ I 3, 13 ¢ 
For any objects that are fixed points of I, compute the images of those objects under g.


10.228. Write $\sum_{i=1}^{N} \frac{x_i}{1 - x_i t}$ in terms of suitable symmetric polynomials.


10.229. Obtain 10.93 and 10.94 algebraically by differentiating the generating functions
$H_N(t)$ and $E_N(-t)$.


10.230. Use the recursions 10.93 and 10.94 to verify the formulas for h4, 3!e3, and 4!e4 
stated in 10.95. 


10.231. Complete the proof of 10.96 by checking that $g(f(y_0)) = y_0$ and $f(g(z_0)) = z_0$,
and, in general, $g \circ f = \mathrm{id}_Y$ and $f \circ g = \mathrm{id}_X$.


10.232. Let g be the map in the proof of 10.96. Compute $g(z_1)$ and $g(z_2)$, where

[Figure: the objects $z_1$ and $z_2$]

10.233. Let f be the map in the proof of 10.96. Compute f(y), where


y= (2,2,2.2),2,5)08,8)04,6)(1,7), ( 5 oe sa ee ane 


10.234. Let I be the involution in the proof of 10.98. Compute $I(z_1)$, $I(z_2)$, and $I(f(y))$,
where $z_1$, $z_2$, and $y$ are the objects given in the preceding two exercises.


10.235. Let A be an $n \times n$ complex matrix with eigenvalues $r_1,\ldots,r_n$. (a) Show that
the trace of A, defined by $\mathrm{tr}(A) = \sum_{i=1}^{n} A(i,i)$, is $p_1(r_1,\ldots,r_n)$. (b) For $k \ge 1$, express
$\mathrm{tr}(A^k)$ as a function of $r_1,\ldots,r_n$. (c) Suppose $n = 5$ and $(\mathrm{tr}(A^k) : k = 1,2,\ldots,5) =
(3, 41, -93, 693, -2957)$. Find the characteristic polynomial of A.


10.236. Compute: (a) $\omega(h_3)$; (b) $\omega(p_{(3,2,1,1)})$; (c) $\omega(e_{(4,4)})$; (d) $\omega(s_{(5,3,3,1,1,1)})$.


10.237. Show that there exists a unique automorphism of the ring and K-vector space $\Lambda_N$
mapping each $c \in K$ to itself and sending each $p_i$ to $-p_i$. Compute the image of $h_n$ and of
$e_n$ under this automorphism.


10.238. In the proof of 10.101(b), where is the assumption n < N needed? 

10.239. Compute the polynomials $\mathrm{fgt}_\mu(x_1, x_2, x_3)$ for all partitions $\mu$ of size at most 3.
10.240. Compute RSK(w) for all $w \in S_3$.

10.241. Compute $\mathrm{RSK}^{-1}(P, Q)$ for all pairs P, Q of standard tableaux of shape (2, 2).


10.242. Let $w = 41572863 \in S_8$. Compute RSK(w) and $\mathrm{RSK}(w^{-1})$. Verify that 10.112
holds in this case. 


10.243. Consider the pair of standard tableaux 


[1] 3|6] 
P=[2/4[8] Q=[3]/6[7} 
[4] 8] 
Compute $w = \mathrm{RSK}^{-1}(P,Q)$ and $v = \mathrm{RSK}^{-1}(Q,P)$, and verify that 10.112 holds in this
case.


10.244. (a) Verify that 10.109 holds for the example w = 35164872 by comparing the first 
rows of the tableaux in Figure 10.1 with the shadow diagram in Figure 10.4. (b) Similarly, 
verify the assertions in 10.111 using Figure 10.5. 


10.245. Draw all the shadow diagrams for the permutations $w$ and $w^{-1}$ in 10.242, and use
them to verify the assertions in 10.111 for this example. 


10.246. Draw all the shadow diagrams for the permutations w and v in 10.243, and use
them to verify the assertions in 10.111 for this example. 


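10.247. (a) Explain why $n! = \sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)|^2$. (b) Verify this identity directly for n = 5.
10.248. Show that the number of $w \in S_n$ such that $w^2 = \mathrm{id}$ is given by $\sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)|$.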
10.249. Compute RSK(w) for all words $w \in \{0, 1\}^3$.


10.250. Compute RSK(313211231), and verify that 10.117 holds in this case. 

10.251. Compute the word w such that
[1 [2/4] 6] 
RSK(w) = [[2[3] 44}, [3[5 [8 [1d } . 
[7191 
co 


10.252. (a) Compute $\sum_{T \in \mathrm{SYT}((4,1))} q^{\mathrm{maj}(T)}$. (b) Compute $\sum_{T \in \mathrm{SYT}((3,2,1))} q^{\mathrm{maj}(T)}$.


10.253. Express pi,4) as a linear combination of Schur polynomials. 


Verify that 10.117 holds. 


10.254. (a) Compute the biword and the pair of tableaux associated to the matrix
$\begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 3 & 2 & 0 \end{bmatrix}$.
(b) Do the same for the transpose of this matrix.


10.255. (a) Compute the matrix and pair of tableaux associated to the biword 
1 12.2 23 3° 5 
2411 3 3 3 2~'i+7° 
(b) Do the same for the biword obtained by switching the two rows and sorting the new 


top row into increasing order (using the values in the bottom row to break ties). 


10.256. (a) Compute the biword and matrix associated to the pair of tableaux 


(b) Do the same for the pair of tableaux (Q, P). 


10.257. Show that if a matrix A maps to (P,Q) under the RSK correspondence, then
the transposed matrix $A^t$ maps to (Q, P) under RSK. Do this by generalizing the shadow
constructions in §10.22, allowing more than one dot to occupy a given point $(i,j)$ in the
graph.


10.258. Give a rigorous justification of the computation

$$\prod_{j=1}^{N} \sum_{k_j=0}^{\infty} h_{k_j}(x_1,\ldots,x_M)\, y_j^{k_j} = \sum_{k_1=0}^{\infty} \cdots \sum_{k_N=0}^{\infty} \prod_{j=1}^{N} h_{k_j}(x_1,\ldots,x_M)\, y_j^{k_j},$$

which was used in the proof of 10.128.

10.259. Verify the fact (used in the proof of 10.131) that the polynomials
$\{p_\alpha(x)\, p_\beta(y)/z_\beta : (\alpha, \beta) \in \mathrm{Par}(k) \times \mathrm{Par}(k)\}$
are linearly independent.


10.260. Suppose A and B are n x n matrices with entries in K such that AB = I. Prove 
that BA = I. 


10.261. Suppose $\{f_\mu : \mu \in \mathrm{Par}(k)\}$ is an orthonormal basis of $\Lambda_N^k$, and $g \in \Lambda_N^k$. Prove that

$$g = \sum_{\mu \in \mathrm{Par}(k)} \langle g, f_\mu \rangle f_\mu.$$

10.262. Quasisymmetric Polynomials. A polynomial $f \in K[x_1,\ldots,x_N]$ is called
quasisymmetric iff for all compositions $\alpha = (\alpha_1,\ldots,\alpha_s)$ with $s \le N$ and all $1 \le i_1 < i_2 < \cdots <
i_s \le N$, the coefficient of $\prod_{j=1}^{s} x_{i_j}^{\alpha_j}$ in $f$ equals the coefficient of $\prod_{j=1}^{s} x_j^{\alpha_j}$ in $f$. For each such
$\alpha$, define the monomial quasisymmetric polynomial $M_\alpha = \sum_{1 \le i_1 < i_2 < \cdots < i_s \le N} \prod_{j=1}^{s} x_{i_j}^{\alpha_j}$. (a)
Show that the quasisymmetric polynomials form a subspace of $K[x_1,\ldots,x_N]$ with basis
$\{M_\alpha\}$. (b) What is the dimension of the space of homogeneous quasisymmetric polynomials
of degree k? (c) Show that every symmetric polynomial is quasisymmetric. More specifically,
express each $m_\lambda$ in terms of the $M_\alpha$'s.


10.263. Fundamental Quasisymmetric Polynomials. For each $n > 0$ and each subset
$S$ of $\{1,2,\ldots,n-1\}$, define $Q_{n,S}(x_1,\ldots,x_N) = \sum x_{i_1} x_{i_2} \cdots x_{i_n}$, where we sum over all
sequences $1 \le i_1 \le i_2 \le \cdots \le i_n \le N$ such that $j \in S$ implies $i_j < i_{j+1}$. $Q_{n,S}$ is called a
fundamental quasisymmetric polynomial. (a) Show that $Q_{n,\emptyset} = h_n$. What is $Q_{n,\{1,2,\ldots,n-1\}}$?
(b) Show that $Q_{n,S}$ is quasisymmetric (as defined in 10.262). (c) Use inclusion-exclusion to
express $M_\alpha$ as a linear combination of $Q_{n,S}$'s and vice versa. Use this to find a basis for the
space of quasisymmetric polynomials consisting of suitable Q's.


10.264. Q-Expansion of Schur Polynomials. For each integer partition $\lambda$ of $n$, prove
that $s_\lambda(x_1,\ldots,x_N) = \sum_{U \in \mathrm{SYT}(\lambda)} Q_{n,\mathrm{Des}(U)}(x_1,\ldots,x_N)$, where $Q_{n,S}$ is defined in 10.263.


Notes 


Macdonald’s book [89] contains a comprehensive treatment of symmetric polynomials, with 
a heavy emphasis on algebraic methods. A more combinatorial development is given by 
Stanley [127, Chpt. 7]; see the references to that chapter for an extensive bibliography 
of the literature in this area. Two other relevant references are Fulton [46], which treats 
tableaux and their connections to representation theory and geometry, and Sagan [121], 
which explains the role of symmetric polynomials in the representation theory of symmetric 
groups. 

The bijective proof of 10.33 is due to Bender and Knuth [9]. The algorithmic proof of the 
existence part of the fundamental theorem of symmetric polynomials (outlined in 10.211) 
is usually attributed to Waring [130]. Some of the seminal papers by Robinson, Schensted, 
and Knuth on what is now called the RSK correspondence are [79, 116, 122]. The symmetry
property 10.112 was first proved by Schützenberger [124], but the combinatorial proof
using shadow lines is due to Viennot [135]. Quasisymmetric polynomials (see 10.262) were 
introduced by Gessel [51]. 

Abaci and Antisymmetric Polynomials


In the last chapter, we used combinatorial operations on tableaux to establish algebraic 
properties of Schur polynomials and symmetric polynomials. This chapter investigates the 
interplay between the combinatorics of abaci and the algebraic properties of antisymmetric 
polynomials. These concepts will be used to establish additional facts about integer parti- 
tions and symmetric polynomials. In particular, we will derive some formulas for expanding 
skew Schur polynomials in terms of various bases. 




11.1 Abaci and Integer Partitions 


An abacus is an instrument used in ancient times for performing arithmetical calculations. 
The abacus consists of one or more runners that contain sliding beads. The following com- 
binatorial object gives a mathematical model of an abacus. 


11.1. Definition: One-Runner Abacus. An abacus with one runner is a function $w :
\mathbb{Z} \to \{0,1\}$ such that for some $m, n$, $w_i = 1$ for all $i \le m$ and $w_i = 0$ for all $i \ge n$. We think
of $w$ as an infinite word $\cdots w_{-2} w_{-1} w_0 w_1 w_2 w_3 \cdots$ that begins with an infinite string of 1's
and ends with an infinite string of 0's. Each 1 is called a bead, and each 0 is called a gap.

Let Abc denote the set of all one-runner abaci. An abacus $w$ is called justified at position
$m$ iff $w_i = 1$ for all $i \le m$ and $w_i = 0$ for all $i > m$. Intuitively, an abacus is justified iff
all the beads have been pushed to the left as far as they will go. The weight of an abacus
$w$, denoted wt($w$), is the number of pairs $i < j$ with $w_i < w_j$ (or equivalently, $w_i = 0$ and
$w_j = 1$).


11.2. Example. Here is a picture of a one-runner abacus:

[Figure: a one-runner abacus with beads and gaps]

This picture corresponds to the mathematical abacus

$w = \cdots 111101100\underline{1}10101000 \cdots$,

where the underlined 1 is $w_0$. All positions to the left of the displayed region contain beads,
and all positions to the right contain gaps.

Consider the actions required to transform $w$ into a justified abacus. We begin with the
bead following the leftmost gap, which slides one position to the left, producing

$w' = \cdots 111110100110101000 \cdots$.

The next bead now slides into the position vacated by the previous bead, producing

$w'' = \cdots 111111000110101000 \cdots$.

The next bead moves 3 positions to the left to give the abacus

$w^{(3)} = \cdots 111111100010101000 \cdots$.

In the next three steps, the remaining beads move left by 3, 4, and 5 positions, respectively,
leading to the abacus

$w^* = \cdots 111111111100000000 \cdots$,

which is justified at position 0. If we list the number of positions that each bead moved,
we obtain a weakly increasing sequence: $1 \le 1 \le 3 \le 3 \le 4 \le 5$. This sequence can be
identified with the integer partition $\lambda = (5,4,3,3,1,1)$. Observe that $\mathrm{wt}(w) = 17 = |\lambda|$.
This example generalizes as follows. 


11.3. Theorem: Partitions vs. Abaci. Justification of abaci defines a bijection $J :
\mathrm{Abc} \to \mathbb{Z} \times \mathrm{Par}$ with inverse $U : \mathbb{Z} \times \mathrm{Par} \to \mathrm{Abc}$. If $J(w) = (m, \lambda)$, then $\mathrm{wt}(w) = |\lambda|$.


Proof. Given an abacus $w$, let $n$ be the least integer with $w_n = 0$ (the position of the
leftmost gap), which exists since $w$ begins with an infinite string of 1's. Since $w$ ends with
an infinite string of 0's, there are only finitely many $j > n$ with $w_j = 1$; let these indices
be $j_1 < j_2 < \cdots < j_t$, where $n < j_1$. We justify the abacus by moving the bead at position
$j_1$ left $\lambda_t = j_1 - n$ places. Then we move the bead at position $j_2$ left $\lambda_{t-1} = j_2 - (n+1)$
places. (We subtract $n + 1$ since the leftmost gap is now at position $n + 1$.) In general,
at stage $k$ we move the bead at position $j_k$ left $\lambda_{t+1-k} = j_k - (n + k - 1)$ places. After
moving all $t$ beads, we will have a justified abacus with the leftmost gap located at position
$n + t$. Since $n < j_1 < j_2 < \cdots < j_t$, it follows that $0 < \lambda_t \le \lambda_{t-1} \le \cdots \le \lambda_1$. We define
$J(w) = (n + t - 1, \lambda)$ where $\lambda = (\lambda_1,\ldots,\lambda_t)$. For all $k$, moving the bead at position $j_k$ left
$\lambda_{t+1-k}$ places decreases the weight of the abacus by $\lambda_{t+1-k}$. Since a justified abacus has
weight zero, it follows that the weight of the original abacus is precisely $\lambda_1 + \cdots + \lambda_t = |\lambda|$.

$J$ is a bijection because "unjustification" is a two-sided inverse for $J$. More precisely,
given $(m, \mu) \in \mathbb{Z} \times \mathrm{Par}$, we create an abacus $U(m, \mu)$ as follows. Start with an abacus
justified at position $m$. Move the rightmost bead to the right $\mu_1$ places, then move the next
bead to the right $\mu_2$ places, and so on. This process reverses the action of $J$. □
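
The maps $J$ and $U$ from the preceding proof can be sketched in Python. This is an illustration under assumptions made here, not part of the text: an abacus is stored as a pair (start, word), meaning that every position below start holds a bead, word lists the symbols at positions start, start + 1, and so on, and every later position is a gap. The function names justify, unjustify, and weight are hypothetical.

```python
def justify(start, word):
    """J: return (m, partition) for the abacus described by (start, word)."""
    symbols = [int(c) for c in word]
    # Skip any leading beads to locate the leftmost gap n.
    offset = 0
    while offset < len(symbols) and symbols[offset] == 1:
        offset += 1
    n = start + offset
    # Positions j_1 < j_2 < ... < j_t of the beads to the right of the leftmost gap.
    beads = [start + i for i in range(offset, len(symbols)) if symbols[i] == 1]
    # The bead j_k moves left j_k - (n + k - 1) places; here k counts from 0.
    moves = [j - (n + k) for k, j in enumerate(beads)]
    return n + len(beads) - 1, tuple(reversed(moves))


def unjustify(m, partition):
    """U: return (start, word) for the abacus U(m, partition)."""
    n = m - len(partition) + 1  # position of the leftmost gap
    # The bead starting at position m - i moves right by the (i+1)th part.
    bead_positions = {m - i + part for i, part in enumerate(partition)}
    last = max(bead_positions, default=n - 1)
    word = "".join("1" if p in bead_positions else "0" for p in range(n, last + 1))
    return n, word


def weight(word):
    """Number of pairs i < j with a gap at i and a bead at j."""
    gaps_seen = 0
    total = 0
    for c in word:
        if c == "0":
            gaps_seen += 1
        else:
            total += gaps_seen
    return total


print(justify(-5, "01100110101"))
print(unjustify(0, (5, 4, 3, 3, 1, 1)))
print(weight("01100110101"))
```

The sample input is the abacus of Example 11.2, whose leftmost gap sits at position $-5$, so the printed values can be compared with the partition and weight found there.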


11.4. Remark: Computing U. The unjustification map $U$ can also be computed using
partition diagrams. We can reconstruct the bead-gap sequence in the abacus $U(m, \mu)$ by
traversing the frontier of the diagram of $\mu$ (traveling northeast) and recording a gap (0) for
each horizontal step and a bead (1) for each vertical step. For example, if $\mu = (5,4,3,3,1,1)$,
the diagram of $\mu$ is

□ □ □ □ □
□ □ □ □
□ □ □
□ □ □
□
□

and the bead-gap sequence is 01100110101. To obtain the abacus $w$, we prepend an infinite
string of 1's, append an infinite string of 0's, and finally use $m$ to determine which
symbol in the resulting string is considered to be $w_0$. One readily checks that this procedure
produces the same abacus as the map $U$ in the previous proof. We can also confirm that
the map $U$ is weight-preserving via the following bijection between the cells of the diagram
of $\mu$ and the pairs $i < j$ with $w_i = 0$ and $w_j = 1$. Starting at a cell $c$, travel south to reach
a horizontal edge on the frontier (encoded by some $w_i = 0$). Travel east from $c$ to reach a
vertical edge on the frontier (encoded by some $w_j = 1$ with $j > i$). For example, the cell in
the second row and third column of the diagram above corresponds to the marked gap-bead
pair in the associated abacus:
$\cdots 0110\underline{0}110\underline{1}01 \cdots$.
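
The frontier reading described in this remark is easy to mechanize. The Python sketch below is illustrative only; the function names frontier_word and pair_for_cell are hypothetical, rows and columns are numbered from 1 as in the text, and the returned indices count symbols of the finite bead-gap sequence starting from 0.

```python
def frontier_word(partition):
    """Bead-gap sequence read along the frontier, traveling northeast."""
    word = []
    previous = 0
    # Start at the bottom row: horizontal steps (gaps) first, then one vertical step (bead).
    for part in reversed(partition):
        word.append("0" * (part - previous))
        word.append("1")
        previous = part
    return "".join(word)


def pair_for_cell(partition, row, col):
    """Indices (gap, bead) in the frontier word matched to the cell (row, col)."""
    word = frontier_word(partition)
    gaps = [i for i, c in enumerate(word) if c == "0"]
    beads = [i for i, c in enumerate(word) if c == "1"]
    # Traveling south from the cell reaches the horizontal step of column col;
    # traveling east reaches the vertical step of row `row` (beads are listed bottom row first).
    return gaps[col - 1], beads[len(partition) - row]


mu = (5, 4, 3, 3, 1, 1)
print(frontier_word(mu))
print(pair_for_cell(mu, 2, 3))
```

Running pair_for_cell over every cell of the diagram produces each gap-bead pair exactly once, which is the weight-preservation statement of the remark in computational form.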


11.2 Jacobi Triple Product Identity 


The Jacobi triple product identity is a partition identity that has several applications in 
combinatorics and number theory. We can give a bijective proof of this identity by using 
cleverly chosen weights on abaci. 


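11.5. Theorem: Jacobi Triple Product Identity. The following equation holds in the
ring $\mathbb{Q}(u)[[q]]$:

$$\sum_{m \in \mathbb{Z}} q^{m(m+1)/2} u^m = \prod_{n \ge 1} (1 + u q^n) \prod_{n \ge 0} (1 + u^{-1} q^n) \prod_{n \ge 1} (1 - q^n).$$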


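Proof. Since the formal power series $\prod_{n \ge 1} (1 - q^n)$ is invertible, it suffices to prove the
equivalent identity

$$\sum_{m \in \mathbb{Z}} q^{m(m+1)/2} u^m \prod_{n \ge 1} \frac{1}{1 - q^n} = \prod_{n \ge 1} (1 + u q^n) \prod_{n \ge 0} (1 + u^{-1} q^n). \qquad (11.1)$$

Let the weight of an integer $m$ be $\mathrm{wt}(m) = q^{m(m+1)/2} u^m$, and let the weight of a partition
$\mu$ be $q^{|\mu|}$. Since $\prod_{n \ge 1} \frac{1}{1 - q^n} = \sum_{\mu \in \mathrm{Par}} q^{|\mu|}$ by 8.17, the left side of (11.1) is

$$\sum_{(m,\mu) \in \mathbb{Z} \times \mathrm{Par}} \mathrm{wt}(m)\, \mathrm{wt}(\mu),$$

which is the generating function for the weighted set $\mathbb{Z} \times \mathrm{Par}$.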

On the other hand, let us define new weights on the set Abc as follows. Given an abacus
$w$, let $N(w) = \{i \le 0 : w_i = 0\}$ be the set of nonpositive positions in $w$ not containing a
bead, and let $P(w) = \{i > 0 : w_i = 1\}$ be the set of positive positions in $w$ containing a
bead. Both $N(w)$ and $P(w)$ are finite sets. Define

$$\mathrm{wt}(w) = \prod_{i \in N(w)} (u^{-1} q^{|i|}) \prod_{i \in P(w)} (u q^{i}).$$

We can build an abacus by choosing a bead or a gap in each nonpositive position (choosing
"bead" all but finitely many times), and then choosing a bead or a gap in each positive
position (choosing "gap" all but finitely many times). The generating function for the choice
at position $i \le 0$ is $1 + u^{-1} q^{|i|}$, while the generating function for the choice at position $i > 0$
is $1 + u q^{i}$. By the product rule for weighted sets (see 8.9), the right side of (11.1) is
$\sum_{w \in \mathrm{Abc}} \mathrm{wt}(w)$.
To complete the proof, it suffices to argue that the justification bijection $J : \mathrm{Abc} \to
\mathbb{Z} \times \mathrm{Par}$ is weight-preserving. Suppose $J(w) = (m, \mu)$ for some abacus $w$. The map $J$
converts $w$ to an abacus $w^*$, justified at position $m$, by $|\mu|$ steps in which some bead
moves one position to the left. Claim 1: The weight of the justified abacus $w^*$ is $\mathrm{wt}(m) =
u^m q^{m(m+1)/2}$. We prove this by considering three cases. When $m = 0$, $N(w^*) = \emptyset = P(w^*)$,
so $\mathrm{wt}(w^*) = 1 = \mathrm{wt}(0)$. When $m > 0$, $N(w^*) = \emptyset$ and $P(w^*) = \{1,2,\ldots,m\}$, so

$$\mathrm{wt}(w^*) = u^m q^{1+2+\cdots+m} = u^m q^{m(m+1)/2} = \mathrm{wt}(m).$$

When $m < 0$, $N(w^*) = \{0,-1,-2,\ldots,-(|m|-1)\}$ and $P(w^*) = \emptyset$, so

$$\mathrm{wt}(w^*) = u^{-|m|} q^{0+1+2+\cdots+(|m|-1)} = u^m q^{|m|(|m|-1)/2} = u^m q^{m(m+1)/2} = \mathrm{wt}(m).$$

Claim 2: If we move one bead one step left in a given abacus $y$, the $u$-weight stays
the same and the $q$-weight drops by 1. Let $i$ be the initial position of the moved bead,
and let $y'$ be the abacus obtained by moving the bead to position $i - 1$. If $i > 1$, then
$N(y') = N(y)$ and $P(y') = (P(y) \setminus \{i\}) \cup \{i-1\}$, so $\mathrm{wt}(y') = \mathrm{wt}(y)/q$ as desired. If $i \le 0$,
then $P(y') = P(y)$ and $N(y') = (N(y) \setminus \{i-1\}) \cup \{i\}$ (since the $N$-set records positions
of gaps), and so $\mathrm{wt}(y') = \mathrm{wt}(y)\, q^{|i|}/q^{|i-1|} = \mathrm{wt}(y)/q$. If $i = 1$, then $P(y') = P(y) \setminus \{1\}$ and
$N(y') = N(y) \setminus \{0\}$, so the total $u$-weight is preserved and the $q$-weight still drops by 1.
Finally, combining the two claims gives

$$\mathrm{wt}(w) = \mathrm{wt}(w^*)\, q^{|\mu|} = \mathrm{wt}(m)\, \mathrm{wt}(\mu) = \mathrm{wt}(J(w)). \qquad \Box$$
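
As a numerical sanity check (not a proof), one can compare both sides of the Jacobi triple product identity up to a fixed power of $q$. The Python sketch below is an illustration under assumptions made here: a Laurent polynomial in $u$ and $q$ is stored as a dictionary from pairs (exponent of u, exponent of q) to integer coefficients, and the function names are hypothetical. Factors with $n$ larger than the truncation degree are omitted because they cannot affect the retained coefficients.

```python
def multiply(f, g, max_degree):
    """Multiply {(u_exp, q_exp): coeff} dictionaries, dropping q-degrees above max_degree."""
    product = {}
    for (ua, qa), ca in f.items():
        for (ub, qb), cb in g.items():
            if qa + qb <= max_degree:
                key = (ua + ub, qa + qb)
                product[key] = product.get(key, 0) + ca * cb
    return {key: c for key, c in product.items() if c != 0}


def left_side(max_degree):
    """Sum over integers m of q^(m(m+1)/2) u^m, truncated at q^max_degree."""
    series = {}
    for m in range(-max_degree - 2, max_degree + 2):
        exponent = m * (m + 1) // 2
        if exponent <= max_degree:
            series[(m, exponent)] = 1
    return series


def right_side(max_degree):
    """Truncated product of (1 + u q^n), (1 + u^-1 q^n), and (1 - q^n)."""
    series = {(0, 0): 1, (-1, 0): 1}  # the n = 0 factor 1 + u^{-1}
    for n in range(1, max_degree + 1):
        factors = (
            {(0, 0): 1, (1, n): 1},    # 1 + u q^n
            {(0, 0): 1, (-1, n): 1},   # 1 + u^{-1} q^n
            {(0, 0): 1, (0, n): -1},   # 1 - q^n
        )
        for factor in factors:
            series = multiply(series, factor, max_degree)
    return series


print(left_side(10) == right_side(10))
```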


Variations of the preceding proof can be used to establish other partition identities. As 
an example, we now sketch a bijective proof of Euler’s pentagonal number theorem. Unlike 
our earlier proof in §8.7, the current proof does not use an involution to cancel oppositely
signed objects. We remark that Euler’s identity also follows by suitably specializing the 
Jacobi triple product identity. 


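11.6. Euler's Pentagonal Number Theorem. In $\mathbb{Q}[[q]]$, we have

$$\prod_{n=1}^{\infty} (1 - q^n) = \sum_{k \in \mathbb{Z}} (-1)^k q^{(3k^2-k)/2}.$$

Proof. Note first that

$$\prod_{n=1}^{\infty} (1 - q^n) = \prod_{i \ge 1} (1 - q^{3i}) \prod_{i \ge 1} (1 - q^{3i-1}) \prod_{i \ge 1} (1 - q^{3i-2}).$$

It therefore suffices to prove the identity

$$\prod_{i \ge 1} (1 - q^{3i-1}) \prod_{i \ge 1} (1 - q^{3i-2}) = \sum_{k \in \mathbb{Z}} (-1)^k q^{(3k^2-k)/2} \prod_{i \ge 1} \frac{1}{1 - q^{3i}} = \sum_{(k,\mu) \in \mathbb{Z} \times \mathrm{Par}} (-1)^k q^{3|\mu| + (3k^2-k)/2}. \qquad (11.2)$$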
Consider abaci $w = \{w_{3k+1} : k \in \mathbb{Z}\}$ whose positions are indexed by integers congruent
to 1 mod 3. Define $N(w) = \{i \le 0 : i \equiv 1 \pmod 3,\ w_i = 0\}$ and $P(w) = \{i > 0 : i \equiv 1
\pmod 3,\ w_i = 1\}$. Let $\mathrm{sgn}(w) = (-1)^{|N(w)| + |P(w)|}$ and $\mathrm{wt}(w) = \sum_{i \in N(w) \cup P(w)} |i|$. We can
compute the generating function $\sum_{w} \mathrm{sgn}(w)\, q^{\mathrm{wt}(w)}$ in two ways. On one hand, placing a bead
or a gap in each negative position and each positive position leads to the double product
on the left side of (11.2). On the other hand, justifying the abacus transforms $w$ into a pair
$(3k - 2, \mu)$ for some $k \in \mathbb{Z}$. As in the proof of the Jacobi triple product identity, one checks
that the justified abacus associated to a given integer $k$ has signed weight $(-1)^k q^{(3k^2-k)/2}$,
while each of the $|\mu|$ bead moves in the justification process reduces the $q$-weight by 3 and
preserves the sign. So the right side of (11.2) is also the generating function for these abaci,
completing the proof. □
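
Euler's identity can also be checked numerically to any fixed order. The short Python sketch below is only an illustration (the function names are hypothetical): it expands the truncated product $\prod_{n \le D} (1 - q^n)$ modulo $q^{D+1}$ and compares the coefficients with the signed pentagonal series.

```python
def euler_product(max_degree):
    """Coefficients of prod_{n>=1} (1 - q^n) up to q^max_degree."""
    coeffs = [0] * (max_degree + 1)
    coeffs[0] = 1
    for n in range(1, max_degree + 1):
        # Multiply by (1 - q^n) in place; descending d keeps coeffs[d - n] un-updated.
        for d in range(max_degree, n - 1, -1):
            coeffs[d] -= coeffs[d - n]
    return coeffs


def pentagonal_series(max_degree):
    """Coefficients of the sum over integers k of (-1)^k q^((3k^2 - k)/2)."""
    coeffs = [0] * (max_degree + 1)
    for k in range(-max_degree - 1, max_degree + 2):
        exponent = (3 * k * k - k) // 2
        if exponent <= max_degree:
            coeffs[exponent] += 1 if k % 2 == 0 else -1
    return coeffs


print(euler_product(7))
print(euler_product(40) == pentagonal_series(40))
```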




11.3 Ribbons and k-Cores


Recall the following fact about division of integers: given integers $a \ge 0$ and $k > 0$, there
exist a unique quotient $q$ and remainder $r$ satisfying $a = kq + r$ and $0 \le r < k$. Our next
goal is to develop an analogous operation for dividing an integer partition $\mu$ by a positive
integer $k$. The result of this operation will consist of $k$ "quotient partitions" together with
a "remainder partition" with special properties. We begin by describing the calculation of
the remainder, which is called a k-core. Abaci will then be used to establish the uniqueness
of the remainder, and this will lead us to the definition of the $k$ quotient partitions.

To motivate our construction, consider the following pictorial method for performing 
integer division. Suppose we wish to divide a = 17 by k = 5, obtaining quotient q = 3 and 
remainder r = 2. To find these answers geometrically, first draw a row of 17 boxes: 


□□□□□□□□□□□□□□□□□


Now, starting at the right end, repeatedly remove strings of five consecutive cells until this
is no longer possible. We depict this process by placing an $i$ in every cell removed at stage
$i$, and writing a star in any leftover cells:

* * 3 3 3 3 3 2 2 2 2 2 1 1 1 1 1


The quotient q is the number of 5-cell blocks we removed (here 3), and the remainder r is 
the number of leftover cells (here 2). This geometric procedure corresponds to the algebraic 
process of subtracting k from a repeatedly until a remainder less than k is reached. For the 
purposes of partition division, we now introduce a two-dimensional version of this strip- 
removal process. 


11.7. Definition: Ribbons. A ribbon is a skew shape that can be formed by starting at a
given square, repeatedly moving left or down one step at a time, and including all squares
visited in this way. A ribbon consisting of $k$ cells is called a k-ribbon. A border ribbon of
a partition $\mu$ is a ribbon $R$ contained in $\mathrm{dg}(\mu)$ such that $\mathrm{dg}(\mu) \setminus R$ is also a partition
diagram.


11.8. Example. Here are two examples of ribbons:

(6,6,4,3)/(5,3,2) =

          □
      □ □ □
    □ □
□ □ □

(7,4,4,4)/(3,3,3,2) =

      □ □ □ □
      □
      □
    □ □

The first ribbon is a 9-ribbon and a border ribbon of (6,6,4,3). The partition (4,3,1) with
diagram

□ □ □ □
□ □ □
□

has exactly eight border ribbons, four of which begin at the cell (1,4).


11.9. Definition: k-cores. Let $k$ be a positive integer. An integer partition $\nu$ is called a
k-core iff no border ribbon of $\nu$ is a k-ribbon.


For example, (4,3, 1) is a 5-core, but not a k-core for any k < 5. 

Suppose $\mu$ is any partition and $k$ is a positive integer. If $\mu$ has no border ribbons of size $k$,
then $\mu$ is a k-core. Otherwise, we can pick one such ribbon and remove it from the diagram
of $\mu$ to obtain a smaller partition diagram. We can iterate this process, repeatedly removing
a border k-ribbon from the current partition diagram until this is no longer possible. Since
the number of cells decreases at each step, the process will eventually terminate. The final
partition $\nu$ (which may be empty) must be a k-core. This partition is the "remainder" when
$\mu$ is divided by $k$.

11.10. Example. Consider the partition $\mu = (5,5,4,3)$ with diagram

□ □ □ □ □
□ □ □ □ □
□ □ □ □
□ □ □

Let us divide $\mu$ by $k = 4$. We record the removal of border 4-ribbons by entering an $i$ in
each square that is removed at stage $i$. Any leftover squares at the end are marked by a
star. One possible removal sequence is the following:


Another possible sequence is: 


Notice that the three 4-ribbons removed were different, but the final 4-core was the same, 
namely $\nu = (4,1)$.


We want to show that the k-core obtained when dividing $\mu$ by $k$ depends only on $\mu$ and
$k$, not on the choice of which border k-ribbon is removed at each stage. We now use abaci
to prove this result.


11.11. Definition: Abacus with k Runners. A k-runner abacus is an ordered k-tuple
of abaci. The set of all such objects is denoted $\mathrm{Abc}^k$.


11.12. Theorem: Decimation of Abaci. For each $k \ge 1$, there are mutually inverse
bijections $D_k : \mathrm{Abc} \to \mathrm{Abc}^k$ (decimation) and $I_k : \mathrm{Abc}^k \to \mathrm{Abc}$ (interleaving).


Proof. Given $w = (w_i : i \in \mathbb{Z}) \in \mathrm{Abc}$, set $D_k(w) = (w^0, w^1, \ldots, w^{k-1})$, where
$w^r = (w_{qk+r} : q \in \mathbb{Z})$ $(0 \le r < k)$.

Thus, the abacus $w^r$ is obtained by reading every $k$th symbol in the original abacus (in
both directions), starting at position $r$. It is routine to check that each $w^r$ is an abacus.
The inverse map interleaves these abaci to reconstruct the original one-runner abacus. More
precisely, given $v = (v^0, v^1, \ldots, v^{k-1}) \in \mathrm{Abc}^k$, let $I_k(v) = z$ where $z_{qk+r} = v^r_q$ for all $q \in \mathbb{Z}$
and $0 \le r < k$. One readily checks that $I_k(v)$ is an abacus and that $D_k$ and $I_k$ are two-sided
inverses. □
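
Decimation and interleaving translate almost literally into code if an abacus is modeled as a function from integers to {0, 1}. The Python sketch below is an illustration under that assumption; the names decimate, interleave, and example_abacus are hypothetical. The sample abacus is $U(-1, (5,5,4,3))$, which by the construction in 11.3 has beads at every position $i \le -5$ and at positions $-1, 1, 3, 4$.

```python
def decimate(w, k):
    """D_k: split a one-runner abacus w (a function on integers) into k runners."""
    # Runner r reads the symbols at positions q*k + r; r=r freezes r in each lambda.
    return [lambda q, r=r: w(q * k + r) for r in range(k)]


def interleave(runners):
    """I_k: merge k runners back into a single one-runner abacus."""
    k = len(runners)
    # Python guarantees i == (i // k) * k + (i % k) with 0 <= i % k < k,
    # also for negative i, which matches the indexing q*k + r of the proof.
    return lambda i: runners[i % k](i // k)


def example_abacus(i):
    """U(-1, (5,5,4,3)): beads at all i <= -5 and at -1, 1, 3, 4."""
    return 1 if i <= -5 or i in (-1, 1, 3, 4) else 0


runners = decimate(example_abacus, 4)
for r, runner in enumerate(runners):
    print(r, [runner(q) for q in range(-2, 3)])

rebuilt = interleave(runners)
print(all(rebuilt(i) == example_abacus(i) for i in range(-40, 40)))
```

Because the abaci are infinite, the round trip can only be compared on a finite window of positions; the window used here is an arbitrary choice.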


By computing $D_k(U(-1, \mu))$, we can convert any partition $\mu$ into a k-runner abacus. We
now show that moving one bead left one step on a k-runner abacus corresponds to removing 
a border k-ribbon from the associated partition diagram. 


11.13. Theorem: Bead Motion vs. Ribbon Removal. Suppose a partition $\mu$ is
encoded by a k-runner abacus $w = (w^0, w^1, \ldots, w^{k-1})$. Suppose that $v$ is a k-runner abacus
obtained from $w$ by changing one substring ...01... to ...10... in some $w^r$. Then the
partition $\nu$ associated to $v$ can be obtained by removing one border k-ribbon from $\mu$.
Moreover, there is a bijection between the set of removable border k-ribbons in $\mu$ and the set of
occurrences of the substring 01 in the components of $w$.

Proof. Recall from 11.4 that we can encode the frontier of a partition $\mu$ by writing a 0
(gap) for each horizontal step and writing a 1 (bead) for each vertical step. The word so
obtained (when preceded by 1's and followed by 0's) is the 1-runner abacus associated to
this partition, and $w$ is the k-decimation of this abacus.

Let $R$ be a border k-ribbon of $\mu$. The southeast border of $R$, which is part of the
frontier of $\mu$, gets encoded as a string of $k + 1$ symbols $r_0, r_1, \ldots, r_k$, where $r_0 = 0$ and
$r_k = 1$. For instance, the first ribbon in 11.8 has southeast border 0001010011. Note that
the northwest border of this ribbon is encoded by 1001010010, which is the string obtained
by interchanging the initial 0 and the terminal 1 in the original string. The following picture
suggests why this property holds for general k-ribbons.

[Figure: the southeast and northwest borders of a k-ribbon]

Since $r_0 = 0$ and $r_k = 1$ are separated by $k$ positions in the 1-runner abacus, these two
symbols map to two consecutive symbols 01 on one of the runners in the k-runner abacus
for $\mu$. Changing these symbols to 10 will interchange $r_0$ and $r_k$ in the original word. Hence,
the portion of the frontier of $\mu$ consisting of the southeast border of $R$ gets replaced by the
northwest border of $R$. So, this bead motion transforms $\mu$ into the partition $\nu$ obtained by
removing the ribbon $R$.

Conversely, each substring 01 in the k-runner abacus for $\mu$ corresponds to a unique pair
of symbols $0 \cdots 1$ in the 1-runner abacus that are $k$ positions apart. This pair corresponds
to a unique pair of steps H...V on the frontier that are $k$ steps apart. Finally, this pair
of steps corresponds to a unique removable border k-ribbon of $\mu$. So, the map from these
ribbons to occurrences of 01 on the runners of $w$ is a bijection. □
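
The correspondence in this proof suggests a direct way to list and remove border k-ribbons using only the finite frontier word. The Python sketch below is illustrative; the function names are hypothetical, and indices count symbols of the frontier word starting from 0. A removable border k-ribbon is recorded as a pair (i, i + k) with a gap at index i and a bead at index i + k.

```python
def frontier_word(partition):
    """Bead-gap sequence of the frontier: 0 per horizontal step, 1 per vertical step."""
    word, previous = [], 0
    for part in reversed(partition):
        word.append("0" * (part - previous) + "1")
        previous = part
    return "".join(word)


def word_to_partition(word):
    """Inverse of frontier_word: each bead contributes the number of gaps before it."""
    parts, gaps_seen = [], 0
    for c in word:
        if c == "0":
            gaps_seen += 1
        else:
            parts.append(gaps_seen)
    return tuple(p for p in reversed(parts) if p > 0)


def removable_ribbons(partition, k):
    """Pairs (i, i + k) with a gap at i and a bead at i + k in the frontier word."""
    word = frontier_word(partition)
    return [(i, i + k) for i in range(len(word) - k)
            if word[i] == "0" and word[i + k] == "1"]


def remove_ribbon(partition, pair):
    """Swap the gap and the bead of the pair, which removes the matching border ribbon."""
    i, j = pair
    symbols = list(frontier_word(partition))
    symbols[i], symbols[j] = "1", "0"
    return word_to_partition("".join(symbols))


mu = (5, 5, 4, 3)
for pair in removable_ribbons(mu, 4):
    print(pair, remove_ribbon(mu, pair))
```

Only the finite word needs to be searched, because the infinite prefix of the abacus contains no gaps and the infinite suffix contains no beads.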


11.14. Example. Let us convert the partition $\mu = (5,5,4,3)$ from 11.10 to a 4-runner
abacus. First, the 1-runner abacus $U(-1, \mu)$ is

$\cdots 111000101011000 \cdots$.

Decimating by 4 produces the following 4-runner abacus (the columns shown are $q = -2, -1, 0, 1, 2$):

$w^0 = \cdots 1\ 0\ 0\ 1\ 0 \cdots$
$w^1 = \cdots 1\ 0\ 1\ 0\ 0 \cdots$
$w^2 = \cdots 1\ 0\ 0\ 0\ 0 \cdots$
$w^3 = \cdots 1\ 1\ 1\ 0\ 0 \cdots$

Note that the bead-gap pattern in this abacus can be read directly from the frontier of $\mu$
by filling in the runners one column at a time, working from left to right. For the purposes
of ribbon removal, one may decide arbitrarily where to place the gap corresponding to the
first step of the frontier; this decision determines the integer $m$ in the expression $U(m, \mu)$.

Now let us start removing ribbons. Suppose we push the rightmost bead on the top 
runner left one position, producing the following abacus: 

Reading down columns to recover the frontier of the new partition, we obtain the partition
$\nu = (4,3,3,3)$. We get $\nu$ from $\mu$ by removing one border 4-ribbon, as shown.


Finally, we push the rightmost bead on the second runner left one position to get the 
following abacus: 


The associated partition is (4,1), as shown here: 


At this point, all runners on the abacus are justified, so no further bead motion is possible. 
This reflects the fact that we can remove no further border 4-ribbons from the 4-core (4, 1). 

Now return to the original partition $\mu$ and the associated 4-runner abacus. Suppose we
start by moving the bead on the second runner left one position, producing the following
abacus:

This corresponds to removing a different border 4-ribbon from $\mu$:


Observe that $\mu$ has exactly two removable border 4-ribbons, whereas the 4-runner abacus
for $\mu$ has exactly two movable beads, in accordance with the last assertion of 11.13.


11.15. Example. Consider the following 3-runner abacus: 


We count six beads on this abacus that can be moved one position left without bumping 
into another bead. Accordingly, we expect the associated partition to have exactly six 
removable border 3-ribbons. This is indeed the case, as shown below (we have marked the 
southwestmost cell of each removable ribbon with an asterisk): 


Now we can prove that the k-core obtained from a partition $\mu$ by repeated removal of
border ribbons is uniquely determined by $\mu$ and $k$.


11.16. Theorem: Uniqueness of k-cores. Suppose $\mu$ is an integer partition and $k \ge 1$ is
an integer. There is exactly one k-core $\rho$ obtainable from $\mu$ by repeatedly removing border
k-ribbons. We call $\rho$ the k-core of $\mu$.


Proof. Let $w$ be a fixed k-runner abacus associated to $\mu$ (say $w = D_k(U(-1, \mu))$ for
definiteness). As we have seen, a particular sequence of ribbon-removal operations on $\mu$
corresponds to a particular sequence of bead motions on $w$. The operations on $\mu$ terminate
when we reach a k-core, whereas the corresponding operations on $w$ terminate when the beads on
all runners of $w$ have been justified. Now $\rho$ is uniquely determined by the justified k-runner
abacus by applying $I_k$ and then $J$. The key observation is that the justified abacus obtained
from $w$ does not depend on the order in which individual bead moves were made. Thus, the
k-core $\rho$ does not depend on the order in which border ribbons are removed from $\mu$. □

11.17. Example. The theorem shows that we can calculate the k-core of $\mu$ by justifying
any k-runner abacus associated to $\mu$. For example, consider the partition $\mu =
(10, 10, 10, 8, 8, 8, 7, 4)$ from 11.15. Justifying the 3-runner abacus in that example produces
the following abacus:


We find that the 3-core of $\mu$ is $(1, 1)$.


11.4 k-Quotients and Hooks 


Each runner of a k-runner abacus can be regarded as a one-runner abacus, which corresponds 
(under the justification bijection J) to an element of Z x Par. This observation leads to the 
definition of the k-quotients of a partition. 


11.18. Definition: k-quotients of a partition. Let $\mu$ be a partition and $k \ge 1$ an integer.
Consider the k-runner abacus $(w^0, w^1, \ldots, w^{k-1}) = D_k(U(-1, \mu))$. Write $J(w^i) = (m_i, \nu^i)$
for $0 \le i < k$. The partitions appearing in the k-tuple $(\nu^0, \nu^1, \ldots, \nu^{k-1})$ are called the
k-quotients of $\mu$.


11.19. Example. Let $\mu = (5,5,4,3)$. In 11.14, we computed the 4-runner abacus
$D_4(U(-1, \mu))$:


Justifying each runner and converting the resulting 4-runner abacus back to a partition
produces the 4-core of $\mu$, namely $(4,1)$. On the other hand, converting each runner to a
separate partition produces the 4-tuple of 4-quotients of $\mu$, namely:


$((2), (1), (0), (0))$.


11.20. Example. Consider the partition $\mu = (10,10,10,8,8,8,7,4)$ from 11.15. We compute

$U(-1, \mu) = \cdots 111100001000101110011100 \cdots$.


Decimation by 3 produces the 3-runner abacus shown here:

Justifying each runner shows that the 3-core of $\mu$ is $\rho = (1,1)$. On the other hand, by
regarding each runner separately as a partition, we obtain the 3-tuple of 3-quotients of $\mu$:


$(\nu^0, \nu^1, \nu^2) = ((3,2,2), (4,4), (3,2,1))$.


Observe that $|\mu| = 65 = 2 + 3 \cdot (7+8+6) = |\rho| + 3|\nu^0| + 3|\nu^1| + 3|\nu^2|$.

Now consider what would have happened if we had performed similar computations on
the 3-runner abacus for $\mu$ displayed in 11.15, which is $D_3(U(0, \mu))$. The 3-core coming from
this abacus is still $(1,1)$, but converting each runner to a partition produces the following
3-tuple:


$((3,2,1), (3,2,2), (4,4))$.


This 3-tuple arises by cyclically shifting the previous 3-tuple one step to the right. One
can check that this holds in general: if the k-quotients for $\mu$ are $(\nu^0, \ldots, \nu^{k-1})$, then the
k-quotients computed using $D_k(U(m, \mu))$ will be $(\nu^{k-m'}, \ldots, \nu^{k-1}, \nu^0, \nu^1, \ldots)$, where $m'$ is
the integer remainder when $m + 1$ is divided by $k$.


11.21. Remark. Here is a way to compute $(w^0, w^1, \ldots, w^{k-1}) = D_k(U(-1, \mu))$ from the
frontier of $\mu$ without writing down the intermediate abacus $U(-1, \mu)$. Draw a line of slope
$-1$ starting at the northwest corner of the diagram of $\mu$. The first step on the frontier of
$\mu$ lying northeast of this line corresponds to position 0 of the zeroth runner $w^0$. The next
step is position 0 on $w^1$, and so on. The step just southwest of the diagonal line is position
$-1$ on $w^{k-1}$, the previous step is position $-1$ on $w^{k-2}$, and so on. To see that this works,
one must check that the first step northeast of the diagonal line gets mapped to position 0
on the one-runner abacus $U(-1, \mu)$; we leave this as an exercise.


11.22. Theorem: Partition Division. Let Core(k) be the set of all k-cores. There is a
bijection

\[ \Delta_k : \mathrm{Par} \to \mathrm{Core}(k) \times \mathrm{Par}^k \]

such that $\Delta_k(\mu) = (\rho, \nu^0, \ldots, \nu^{k-1})$, where $\rho$ is the k-core of $\mu$ and the $\nu^i$ are the
k-quotients of $\mu$. We have

\[ |\mu| = |\rho| + k \sum_{i=0}^{k-1} |\nu^i|. \]


Proof. The function $\Delta_k$ is well-defined and maps into the stated codomain. To see that this
function is a bijection, we describe its inverse. Given $(\rho, \nu^0, \ldots, \nu^{k-1}) \in \mathrm{Core}(k) \times \mathrm{Par}^k$, first
compute the k-runner abacus $(w^0, \ldots, w^{k-1}) = D_k(U(-1, \rho))$. Each $w^i$ is itself a justified
one-runner abacus because $\rho$ is a k-core; say $w^i$ is justified at position $m_i$. Now replace each
$w^i$ by $v^i = U(m_i, \nu^i)$. Finally, let $\mu$ be the unique partition satisfying $J(I_k(v^0, \ldots, v^{k-1})) =
(-1, \mu)$. This construction reverses the one used to produce k-cores and k-quotients, so $\mu$ is
the unique partition mapped to $(\rho, \nu^0, \ldots, \nu^{k-1})$ by $\Delta_k$.

To prove the formula for $|\mu|$, consider the bead movements used to justify the runners
of the k-runner abacus $D_k(U(-1, \mu))$. On one hand, every time we move a bead one step
left on this abacus, the area of $\mu$ drops by $k$ since the bead motion removes one border
k-ribbon. When we finish moving all the beads, we are left with the k-core $\rho$. It follows
that $|\mu| = |\rho| + km$ where $m$ is the total number of bead motions on all $k$ runners. On the
other hand, for $0 \le i < k$, let $m_i$ be the number of times we move a bead one step left on
runner $i$. Then $m = m_0 + m_1 + \cdots + m_{k-1}$, whereas $m_i = |\nu^i|$ by 11.3. Substituting these
expressions into $|\mu| = |\rho| + km$ gives the desired formula. □
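
The map from a partition to its k-core and k-quotients can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the original development: the function names are hypothetical, and it assumes the convention (consistent with Examples 11.19 and 11.20) that the beads of U(-1, mu) sit at positions mu_i - i for i = 1, 2, .... Only finitely many beads are tracked; the number tracked is a multiple of k, so every runner is completely filled below a common index.

```python
def bead_positions(mu, num_beads):
    """Positions mu_i - i of the num_beads rightmost beads of U(-1, mu)."""
    padded = list(mu) + [0] * (num_beads - len(mu))
    return [part - i for i, part in enumerate(padded, start=1)]


def partition_from_beads(positions):
    """Inverse of bead_positions: sort the beads, add i back, drop zero parts."""
    ordered = sorted(positions, reverse=True)
    parts = [pos + i for i, pos in enumerate(ordered, start=1)]
    return tuple(p for p in parts if p > 0)


def core_and_quotients(mu, k):
    """Return (k-core, k-quotients) of mu by decimating and justifying."""
    # Track a multiple of k beads: every position below -num_beads holds a
    # bead, so on each runner all indices below `floor` are filled.
    num_beads = k * max(1, -(-len(mu) // k))
    floor = -num_beads // k
    runners = [[] for _ in range(k)]
    for pos in bead_positions(mu, num_beads):
        index, r = divmod(pos, k)          # pos = k * index + r, 0 <= r < k
        runners[r].append(index)
    core_beads, quotients = [], []
    for r, indices in enumerate(runners):
        indices.sort(reverse=True)
        count = len(indices)
        # After justification the tracked beads occupy floor, floor + 1, ...
        justified = [floor + count - j for j in range(1, count + 1)]
        quotients.append(tuple(b - t for b, t in zip(indices, justified) if b > t))
        core_beads.extend(k * t + r for t in justified)
    return partition_from_beads(core_beads), tuple(quotients)


mu = (10, 10, 10, 8, 8, 8, 7, 4)
core, quots = core_and_quotients(mu, 3)
print(core, quots)
```

For the partition of Example 11.20 this reproduces the 3-core (1,1) and the 3-quotients ((3,2,2), (4,4), (3,2,1)), and the sizes satisfy the identity of Theorem 11.22.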


We close our discussion of partition division by describing a way to compute the
k-quotients of $\mu$ directly from the diagram of $\mu$, without recourse to abaci. We will need the
following device for labeling cells of $\mathrm{dg}(\mu)$ and steps on the frontier of $\mu$ by integers in
$\{0, 1, \ldots, k-1\}$.

FIGURE 11.1
Content and 3-content of cells and steps.


11.23. Definition: Content and k-Content. Consider a partition diagram for $\mu$, drawn
with the longest row on top. Introduce a coordinate system so that the northwest corner of
the diagram is $(0,0)$ and $(i,j)$ is located $i$ steps south and $j$ steps east of the origin. The
content of the point $(i,j)$ is $c(i,j) = j - i$. The content of a cell in the diagram of $\mu$ is the
content of its southeast corner. The content of a frontier step from $(i,j)$ to $(i,j+1)$ is $j - i$.
The content of a frontier step from $(i,j)$ to $(i-1,j)$ is $j - i$. If $z$ is a point, cell, or step
in the diagram, then the k-content $c_k(z)$ is the unique value $r \in \{0, 1, \ldots, k-1\}$ such that
$c(z) \equiv r \pmod{k}$.


11.24. Example. The left side of Figure 11.1 shows the diagram of the partition $\mu =
(10, 10, 10, 8, 8, 8, 7, 4)$ with each cell and frontier step labeled by its content. On the right
side of the figure, each cell and step is labeled by its 3-content. Given a cell in the diagram
of $\mu$, we obtain an associated pair of steps on the frontier of $\mu$ by traveling south (resp.
east) from the cell in question. Suppose we mark all cells whose associated steps both have
3-content zero. Then erase all other cells and shift the marked cells up and left as far as
possible. The following diagram results:


This partition $(3,2,2)$ is precisely the zeroth 3-quotient of $\mu$. Similarly, marking the cells
whose associated steps both have 3-content equal to 1 produces the next 3-quotient of $\mu$:

Finally, marking the cells whose associated steps both have 3-content equal to 2 produces
the last 3-quotient of $\mu$:

In general, to obtain the $i$th k-quotient $\nu^i$ of $\mu$ from the diagram of $\mu$, label each row
(resp. column) of the diagram with the k-content of the frontier step located in that row
(resp. column). Erase all rows and columns not labeled $i$. The number of cells remaining in
the $j$th unerased row is the $j$th part of $\nu^i$. To see why this works, recall that the cells of
$\nu^i$ correspond bijectively to the pairs of symbols $0 \cdots 1$ on the $i$th runner of the k-runner
abacus for $\mu$. In turn, these pairs correspond to pairs of symbols $w_s = 0$, $w_t = 1$ on the
one-runner abacus for $\mu$ where $s < t$ and $s \equiv i \equiv t \pmod{k}$. The symbols in positions
congruent to $i$ mod $k$ come from the steps on the frontier of $\mu$ whose k-content is $i$. Finally,
the relevant pairs of steps on the frontier correspond to the unerased cells in the construction
described above. Composing all these bijections, we see that the cells of $\nu^i$ are in one-to-one
correspondence with the unerased cells of the construction. Furthermore, cells in row $j$ of
$\nu^i$ are mapped onto the unerased cells in the $j$th unerased row of $\mu$. It follows that the
construction at the beginning of this paragraph does indeed produce the k-quotient $\nu^i$.
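
The row-and-column erasing procedure just described can be sketched directly in Python. The sketch below is illustrative and its function name is hypothetical. It assumes, following Definition 11.23, that the north frontier step at the end of row r (counting from 1) has content mu_r - r, and that the east frontier step at the bottom of column c has content c - 1 - (height of column c).

```python
def quotients_from_diagram(mu, k):
    """Compute the k-quotients of mu by labeling rows and columns with
    the k-content of their frontier steps and erasing mismatched ones."""
    mu = [p for p in mu if p > 0]
    width = mu[0] if mu else 0
    # Height of column c is the number of rows of length at least c.
    heights = [sum(1 for p in mu if p >= c) for c in range(1, width + 1)]
    row_label = [(p - r) % k for r, p in enumerate(mu, start=1)]
    col_label = [(c - 1 - h) % k for c, h in enumerate(heights, start=1)]
    quotients = []
    for i in range(k):
        parts = []
        for p, label in zip(mu, row_label):
            if label != i:
                continue              # this row is erased
            # Count the unerased columns that meet this row.
            kept = sum(1 for c in range(p) if col_label[c] == i)
            if kept:
                parts.append(kept)
        quotients.append(tuple(parts))
    return tuple(quotients)


print(quotients_from_diagram((10, 10, 10, 8, 8, 8, 7, 4), 3))
```

On the partition of Example 11.24 this returns the three shapes (3,2,2), (4,4), and (3,2,1), in that order, without ever building an abacus.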




11.5 Antisymmetric Polynomials 


We now define antisymmetric polynomials, which form a vector space analogous to the 
space of symmetric polynomials studied in the last chapter. 


11.25. Definition: Antisymmetric Polynomials. Let $K$ be a field containing $\mathbb{Q}$. A
polynomial $f \in K[x_1, \ldots, x_N]$ is called antisymmetric iff for all $w \in S_N$,


\[ f(x_{w(1)}, x_{w(2)}, \ldots, x_{w(N)}) = \mathrm{sgn}(w)\, f(x_1, x_2, \ldots, x_N). \]


11.26. Remark. The group $S_N$ acts on the set $\{x_1, \ldots, x_N\}$ via $w \bullet x_i = x_{w(i)}$ for $w \in S_N$
and $1 \le i \le N$. This action extends (by the universal mapping property of polynomial
rings) to an action of $S_N$ on $K[x_1, \ldots, x_N]$ such that $w \bullet f = f(x_{w(1)}, \ldots, x_{w(N)})$ for
$w \in S_N$ and $f \in K[x_1, \ldots, x_N]$. The polynomial $f$ is antisymmetric iff $w \bullet f = \mathrm{sgn}(w) f$
for all $w \in S_N$. It suffices to check this condition when $w$ is a basic transposition $(i, i+1)$.
For, any $w \in S_N$ can be written as a product of basic transpositions $w = t_1 t_2 \cdots t_k$. By
hypothesis, $t_i \bullet (\pm f) = \mathrm{sgn}(t_i)(\pm f) = \mp f$ for all $i$, so


\[ w \bullet f = t_1 \bullet \cdots \bullet (t_k \bullet f) = (-1)^k f = \mathrm{sgn}(w) f. \]


So $f \in K[x_1, \ldots, x_N]$ is antisymmetric iff


\[ f(x_1, \ldots, x_{i+1}, x_i, \ldots, x_N) = -f(x_1, \ldots, x_i, x_{i+1}, \ldots, x_N) \quad \text{for all } i < N. \]

11.27. Example. The polynomial $f(x_1, \ldots, x_N) = \prod_{1 \le j < k \le N} (x_j - x_k)$ is antisymmetric.
To check this, consider what happens to the factors in the product when we interchange
$x_i$ and $x_{i+1}$. Factors not involving $x_i$ or $x_{i+1}$ are unchanged; factors of the form $(x_i - x_k)$
with $k > i + 1$ get interchanged with factors of the form $(x_{i+1} - x_k)$; and factors of the
form $(x_j - x_i)$ with $j < i$ get interchanged with factors of the form $(x_j - x_{i+1})$. Finally,
the factor $(x_i - x_{i+1})$ becomes $(x_{i+1} - x_i) = -(x_i - x_{i+1})$. Thus, $(i, i+1) \bullet f = -f$ for all
$i < N$, proving antisymmetry of $f$.


We remark that the polynomial $f = \prod_{j<k} (x_j - x_k)$ in the previous example is the
Vandermonde determinant


\[ \det \| x_j^{N-i} \|_{1 \le i,j \le N} = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} x_{w(i)}^{N-i} \]


(see §12.9 for a combinatorial proof of this assertion). We can use analogous determinants
to manufacture additional examples of antisymmetric polynomials.


11.28. Definition: Monomial Antisymmetric Polynomials. Let $\mu = (\mu_1 > \mu_2 >
\cdots > \mu_N)$ be a strictly decreasing sequence of $N$ nonnegative integers. Define a polynomial
$a_\mu(x_1, \ldots, x_N)$ by the formula


\[ a_\mu(x_1, \ldots, x_N) = \det \| x_j^{\mu_i} \|_{1 \le i,j \le N} = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} x_{w(i)}^{\mu_i}. \]


We call $a_\mu$ a monomial antisymmetric polynomial indexed by $\mu$.


To see that $a_\mu$ really is antisymmetric, note that interchanging $x_k$ and $x_{k+1}$ has the
effect of interchanging columns $k$ and $k + 1$ in the determinant defining $a_\mu$. By 9.47, this
column switch will change the sign of $a_\mu$, as required.


11.29. Example. Let $N = 3$ and $\mu = (5,4,2)$. Then


\[ a_\mu(x_1, x_2, x_3) = +x_1^5 x_2^4 x_3^2 + x_1^4 x_2^2 x_3^5 + x_1^2 x_2^5 x_3^4 - x_1^4 x_2^5 x_3^2 - x_1^5 x_2^2 x_3^4 - x_1^2 x_2^4 x_3^5. \]

As the previous example shows, $a_\mu(x_1, \ldots, x_N)$ is a sum of $N!$ distinct monomials
obtained by rearranging the subscripts (or equivalently, the exponents) in the monomial
$x_1^{\mu_1} x_2^{\mu_2} \cdots x_N^{\mu_N}$. Each monomial appears in the sum with sign $+1$ or $-1$, where the sign of
$x_1^{e_1} \cdots x_N^{e_N}$ depends on the parity of the number of basic transpositions needed to transform
the sequence $(e_1, \ldots, e_N)$ to the sorted sequence $(\mu_1, \ldots, \mu_N)$. It follows from these remarks
that $a_\mu$ is a nonzero homogeneous polynomial of degree $|\mu| = \mu_1 + \cdots + \mu_N$.
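
The permutation expansion of the monomial antisymmetric polynomial is easy to evaluate by machine. The following Python sketch is an illustration under stated assumptions, not the text's own method: a polynomial is stored as a dictionary from exponent tuples to integer coefficients, variables are indexed from 0 rather than 1, and the helper names are hypothetical.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    n = len(perm)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                     if perm[a] > perm[b])
    return -1 if inversions % 2 else 1


def monomial_antisymmetric(mu):
    """Return the sum over w of sgn(w) * prod_i x_{w(i)}^{mu_i}
    as a dict {exponent tuple: coefficient}."""
    n = len(mu)
    poly = {}
    for w in permutations(range(n)):
        exponents = [0] * n
        for i in range(n):
            exponents[w[i]] = mu[i]      # variable x_{w(i)} gets exponent mu_i
        key = tuple(exponents)
        poly[key] = poly.get(key, 0) + sign(w)
    return {e: c for e, c in poly.items() if c != 0}


a = monomial_antisymmetric((5, 4, 2))
for exponents, coeff in sorted(a.items(), reverse=True):
    print(coeff, exponents)
```

For (5,4,2) the dictionary has six entries with coefficients +1 or -1, and swapping two exponents negates the coefficient. If the input sequence has a repeated entry, all terms cancel and the result is empty, matching the fact that a determinant with two equal rows vanishes.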


11.30. Definition: $\delta(N)$. For each $N \ge 1$, let $\delta(N) = (N-1, N-2, \ldots, 2, 1, 0)$.


The strictly decreasing sequences $\mu = (\mu_1 > \mu_2 > \cdots > \mu_N)$ correspond bijectively to
the weakly decreasing sequences $\lambda = (\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N)$ via the maps $\mu \mapsto \mu - \delta(N)$
and $\lambda \mapsto \lambda + \delta(N)$. It follows that each polynomial $a_\mu$ can be written $a_{\lambda+\delta(N)}$ for a unique
partition $\lambda \in \mathrm{Par}_N$. This indexing scheme will be used frequently below. Note that when
$\lambda = (0, \ldots, 0)$, we have $\mu = \delta(N)$ and $a_{\delta(N)} = \prod_{1 \le j < k \le N} (x_j - x_k)$ (see 11.27). Observe
that $a_{\delta(N)}$ is a homogeneous polynomial of degree $N(N-1)/2 = \binom{N}{2}$.


11.31. Definition: Spaces of Antisymmetric Polynomials. For a given field $K$, let
$A_N$ be the set of all antisymmetric polynomials in $K[x_1, \ldots, x_N]$. Let $A_N^n$ consist of those
polynomials in $A_N$ that are homogeneous of degree $n$, together with the zero polynomial.

One readily verifies that $A_N$ is a vector subspace of $K[x_1, \ldots, x_N]$, and each $A_N^n$ is a
subspace of $A_N$. We now exhibit bases for these vector spaces involving monomial antisymmetric
polynomials. We use the notation $\mathrm{Par}_N^d(n)$ to denote the set of all partitions of $n$
into $N$ distinct nonnegative parts, and $\mathrm{Par}_N^d = \bigcup_{n \ge \binom{N}{2}} \mathrm{Par}_N^d(n)$.


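11.32. Theorem: Monomial Basis for $A_N^n$. Assume $K$ is a field containing $\mathbb{Q}$. If $n < \binom{N}{2}$,
then $A_N^n = \{0\}$. If $n \ge \binom{N}{2}$, then

\[ \{a_\mu : \mu \in \mathrm{Par}_N^d(n)\} = \{a_{\lambda+\delta(N)} : \lambda \in \mathrm{Par}_N(n - N(N-1)/2)\} \]

is a basis of the $K$-vector space $A_N^n$. Hence, the collection

\[ \{a_\mu : \mu \in \mathrm{Par}_N^d\} = \{a_{\lambda+\delta(N)} : \lambda \in \mathrm{Par}_N\} \]

is a basis of $A_N$.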


Proof. Suppose $e = (e_1, \ldots, e_N)$ is any exponent sequence, $f \in A_N$ is an arbitrary
antisymmetric polynomial, and $w \in S_N$. Let $c$ be the coefficient of $x^e = x_1^{e_1} \cdots x_N^{e_N}$ in $f$,
so


\[ f = c\, x_1^{e_1} \cdots x_N^{e_N} + \text{other terms}. \]


Acting by $w$, we see that


\[ \mathrm{sgn}(w) f = w \bullet f = c\, x_{w(1)}^{e_1} \cdots x_{w(N)}^{e_N} + \text{other terms} \]
\[ = c\, x_1^{e_{w^{-1}(1)}} \cdots x_N^{e_{w^{-1}(N)}} + \text{other terms} \]
\[ = c\, x^{w * e} + \text{other terms}, \]


where $w * e = (e_{w^{-1}(1)}, \ldots, e_{w^{-1}(N)})$. In other words, writing $f|_{x^e}$ for the coefficient of $x^e$
in $f$, we have $f|_{x^{w*e}} = \mathrm{sgn}(w)(f|_{x^e})$.

Let us apply this fact to an exponent sequence $e$ such that $e_i = e_j$ for some $i \ne j$. Let
$w = (i,j)$, so that $w * e = e$ and $\mathrm{sgn}(w) = -1$. It follows that $c = -c$, so $2c = 0$. Because
$K$ contains $\mathbb{Q}$, we deduce $c = 0$ in $K$. This means that no antisymmetric polynomial
contains any monomial with a repeated value in its exponent vector. In particular, the
smallest possible degree of a monomial that can appear with nonzero coefficient in any
antisymmetric polynomial is $0 + 1 + 2 + \cdots + (N-1) = \binom{N}{2}$. This proves the first assertion
of the theorem.

For the second assertion, recall that $\lambda \mapsto \lambda + \delta(N)$ is a bijection from $\mathrm{Par}_N(n - \binom{N}{2})$
to $\mathrm{Par}_N^d(n)$. So we need only show that $\{a_\mu : \mu \in \mathrm{Par}_N^d(n)\}$ is a basis for $A_N^n$. To show
that this set spans $A_N^n$, fix $f \in A_N^n$. By the previous paragraph, we can write $f = \sum_\alpha c_\alpha x^\alpha$
where we sum over all sequences $\alpha = (\alpha_1, \ldots, \alpha_N) \in \mathbb{N}^N$ with distinct entries summing to $n$,
and each $c_\alpha$ lies in $K$. We claim


\[ f = \sum_{\nu \in \mathrm{Par}_N^d(n)} c_\nu a_\nu. \]


To prove this, we check the coefficient of $x^\alpha$ on each side. Choose $\mu \in \mathrm{Par}_N^d(n)$ and $w \in S_N$
such that $w * \mu = \alpha$ ($\mu$ consists of the entries of $\alpha$ sorted into decreasing order). By the first
paragraph of the proof,


\[ f|_{x^\alpha} = f|_{x^{w*\mu}} = \mathrm{sgn}(w)(f|_{x^\mu}) = \mathrm{sgn}(w) c_\mu. \]


On the other side, $a_\nu|_{x^\alpha} = 0$ for all $\nu \ne \mu$ (since no rearrangement of $\nu$ equals $\alpha$). For $\nu = \mu$,
antisymmetry gives $a_\mu|_{x^\alpha} = \mathrm{sgn}(w)(a_\mu|_{x^\mu}) = \mathrm{sgn}(w)$. Multiplying by $c_\nu$ and summing over
all $\nu$, the coefficient of $x^\alpha$ in $\sum_\nu c_\nu a_\nu$ is $\mathrm{sgn}(w) c_\mu$, as desired.

To prove linear independence, suppose


\[ 0 = \sum_{\nu \in \mathrm{Par}_N^d(n)} d_\nu a_\nu \qquad (d_\nu \in K). \]


For a fixed $\mu \in \mathrm{Par}_N^d(n)$, $a_\mu$ is the only polynomial among the $a_\nu$'s that involves the
monomial $x^\mu$. Extracting this coefficient on both sides of the given equation, we find that
$0 = d_\mu \cdot 1 = d_\mu$. Since $\mu$ was arbitrary, all $d_\mu$'s are zero. □


The next result explains the relationship between the various vector spaces $A_N^n$ and $\Lambda_N^k$.


11.33. Theorem: Symmetric vs. Antisymmetric Polynomials. For each $k \ge 0$, the
vector spaces $\Lambda_N^k$ and $A_N^{k+\binom{N}{2}}$ are isomorphic, as are the vector spaces $\Lambda_N$ and $A_N$. In
each of these cases, an isomorphism is given by the formula $M(f) = f \cdot a_{\delta(N)}$ for $f \in \Lambda_N$,
and the inverse isomorphism sends $g \in A_N$ to $g / a_{\delta(N)}$. In particular, every antisymmetric
polynomial in $N$ variables is divisible by the polynomial $a_{\delta(N)}$.


Proof. Fix $k \ge 0$, and consider the map $M = M_k : \Lambda_N^k \to K[x_1, \ldots, x_N]$ defined by
$M(f) = f \cdot a_{\delta(N)}$ for $f \in \Lambda_N^k$. Note, first, that $f$ is homogeneous of degree $k$ and $a_{\delta(N)}$ is
homogeneous of degree $\binom{N}{2}$, so $M(f)$ is homogeneous of degree $k + \binom{N}{2}$. Second, $M(f)$ is
antisymmetric, since for any $w \in S_N$,


\[ w \bullet (f a_{\delta(N)}) = (w \bullet f) \cdot (w \bullet a_{\delta(N)}) = f \cdot (\mathrm{sgn}(w) a_{\delta(N)}) = \mathrm{sgn}(w)(f a_{\delta(N)}). \]


So the map $M$ takes values in the space $A_N^{k+\binom{N}{2}}$. Third, one immediately verifies that $M$ is a
$K$-linear map. Fourth, the kernel of this linear map is zero: $M(f) = 0$ implies $f \cdot a_{\delta(N)} = 0$,
which implies $f = 0$ since $a_{\delta(N)}$ is a nonzero element of the integral domain $K[x_1, \ldots, x_N]$.
So $M$ is injective. Fifth, $M$ must also be surjective since its domain and codomain are vector
spaces having the same finite dimension $|\mathrm{Par}_N(k)|$. So each $M_k$ is an isomorphism. Since
$\Lambda_N$ (resp. $A_N$) is the direct sum of subspaces $\Lambda_N^k$ (resp. $A_N^{k+\binom{N}{2}}$), it follows that $\Lambda_N$ and
$A_N$ are isomorphic as well. Finally, surjectivity of the map $f \mapsto f a_{\delta(N)}$ means that every
antisymmetric polynomial $g$ has the form $f a_{\delta(N)}$ for some symmetric polynomial $f$. So $g$ is
divisible by $a_{\delta(N)}$ in $K[x_1, \ldots, x_N]$. □


11.34. Remark. Suppose we apply the inverse of the isomorphism $M_k$ to the basis
$\{a_{\lambda+\delta(N)} : \lambda \in \mathrm{Par}_N(k)\}$ of $A_N^{k+\binom{N}{2}}$. We will obtain a basis $\{a_{\lambda+\delta(N)} / a_{\delta(N)} : \lambda \in \mathrm{Par}_N(k)\}$ of
$\Lambda_N^k$. It turns out that $a_{\lambda+\delta(N)} / a_{\delta(N)}$ is none other than the Schur polynomial $s_\lambda(x_1, \ldots, x_N)$!
To prove this fact and other properties of antisymmetric polynomials, we will use the labeled
abaci introduced below.


11.6 Labeled Abaci 


Given $\mu = (\mu_1 > \mu_2 > \cdots > \mu_N)$, recall that the monomial antisymmetric polynomial
indexed by $\mu$ is defined by

\[ a_\mu(x_1, \ldots, x_N) = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} x_{w(i)}^{\mu_i}. \]


FIGURE 11.2
Labeled abaci.


The next definition introduces a set of signed, weighted combinatorial objects to model this 
formula. 


11.35. Definition: Labeled Abaci. A labeled abacus with $N$ beads is a word $v = (v_i :
i \ge 0)$ such that each letter $1, \ldots, N$ appears exactly once in $v$, and all other letters of $v$
are zero. We think of the indices $i$ as positions on an abacus containing one runner that
extends to infinity in the positive direction. When $v_i = 0$, there is a gap at position $i$ on the
abacus; when $v_i = j > 0$, there is a bead labeled $j$ at position $i$. The weight of the abacus
$v$ is

\[ \mathrm{wt}(v) = \prod_{i :\, v_i > 0} x_{v_i}^{i}. \]

So if bead $j$ is located at position $i$, this bead contributes a factor of $x_j^i$ to the weight.

We can encode a labeled abacus by specifying the positions occupied by the beads and
the ordering of the bead labels. Formally, define $\mathrm{pos}(v) = (\mu_1 > \mu_2 > \cdots > \mu_N)$ to be the
indices $i$ such that $v_i > 0$. Then define $w(v) = (v_{\mu_1}, \ldots, v_{\mu_N}) \in S_N$. We define the sign of
$v$ to be the sign of the permutation $w(v)$, which is $(-1)^{\mathrm{inv}(w(v))}$. Let LAbc be the set of all
labeled abaci, and for each $\mu \in \mathrm{Par}_N^d$, let


\[ \mathrm{LAbc}(\mu) = \{v \in \mathrm{LAbc} : \mathrm{pos}(v) = \mu\}. \]


For each $\mu \in \mathrm{Par}_N^d$, there is a bijection between $\mathrm{LAbc}(\mu)$ and $S_N$ given by $v \mapsto w(v)$.
Furthermore, an abacus $v \in \mathrm{LAbc}(\mu)$ has sign $\mathrm{sgn}(w(v))$ and weight $\prod_{i=1}^{N} x_{w(v)_i}^{\mu_i}$. So


\[ \sum_{v \in \mathrm{LAbc}(\mu)} \mathrm{sgn}(v)\, \mathrm{wt}(v) = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} x_{w(i)}^{\mu_i} = a_\mu(x_1, \ldots, x_N). \]


11.36. Example. Let $N = 3$ and $\nu = (5,4,2)$. Earlier, we computed


\[ a_\nu(x_1, x_2, x_3) = +x_1^5 x_2^4 x_3^2 + x_1^4 x_2^2 x_3^5 + x_1^2 x_2^5 x_3^4 - x_1^4 x_2^5 x_3^2 - x_1^5 x_2^2 x_3^4 - x_1^2 x_2^4 x_3^5. \]

The six terms in this polynomial come from the six labeled abaci in $\mathrm{LAbc}(\nu)$ shown in
Figure 11.2. Observe that we read labels from right to left in $v$ to obtain the permutation
$w(v)$. This is necessary so that the "leading term" $x_1^{\nu_1} \cdots x_N^{\nu_N}$ will correspond to the identity
permutation and have a positive sign.

Informally, we justify a labeled abacus $v \in \mathrm{LAbc}(\mu)$ by moving all beads to the left as far
as they will go. This produces a justified labeled abacus $J(v) = (w_N, \ldots, w_2, w_1, 0, 0, \ldots) \in
\mathrm{LAbc}(\delta(N))$, where $(w_1, \ldots, w_N) = w(v)$. To recover $v$ from $J(v)$, first write $\mu = \lambda + \delta(N)$
for some $\lambda \in \mathrm{Par}_N$. Move the rightmost bead (labeled $w_1$) to the right $\lambda_1$ positions from
position $N - 1$ to position $N - 1 + \lambda_1 = \mu_1$. Then move the next bead (labeled $w_2$) to the
right $\lambda_2$ positions from position $N - 2$ to position $N - 2 + \lambda_2 = \mu_2$, and so on.


11.7 Pieri Rule for $p_k$


The product of an antisymmetric polynomial and a symmetric polynomial is an antisymmetric
polynomial (see 11.96), which can be written as a linear combination of the monomial
antisymmetric polynomials. In the next few sections, we will derive several "Pieri rules"
for expressing a product $a_{\lambda+\delta(N)}\, g$ (where $g$ is symmetric) in terms of the $a_\mu$'s. We begin
by considering the case where $g = p_k(x_1, \ldots, x_N) = \sum_{i=1}^{N} x_i^k$ is a power-sum symmetric
polynomial.

We know $a_{\lambda+\delta(N)}(x_1, \ldots, x_N)$ is a sum of signed terms, each of which represents a labeled
abacus with beads in positions given by $\mu = \lambda + \delta(N)$. If we multiply some term in this
sum by $x_i^k$, what happens to the associated abacus? Recalling that the power of $x_i$ tells us
where bead $i$ is located, we see that this multiplication should move bead $i$ to the right $k$
positions. This bead motion occurs all at once, not one step at a time, so bead $i$ is allowed
to "jump over" any beads between its original position and its destination. However, there
is a problem if the new position for bead $i$ already contains a bead. In the proofs below, we
will see that two objects of opposite sign cancel whenever a bead collision like this occurs. If
there is no collision, the motion of bead $i$ will produce a new labeled abacus whose $x_i$-weight
has increased by $k$. However, the sign of the new abacus (compared to the original) depends
on the parity of the number of beads that bead $i$ "jumps over" when it moves to its new
position.

To visualize these ideas more conveniently, we decimate our labeled abacus to obtain a
labeled abacus with $k$ runners. Formally, the k-decimation of the labeled abacus $v = (v_j :
j \ge 0) \in \mathrm{LAbc}(\lambda + \delta(N))$ is the k-tuple $(v^0, v^1, \ldots, v^{k-1})$, where $v^r_q = v_{qk+r}$. Moving a bead
from position $j$ to position $j + k$ on the original abacus corresponds to moving a bead one
position along its runner on the k-runner abacus. If there is already a bead in position $j + k$,
we say that this bead move causes a bead collision. Otherwise, the bead motion produces a
new labeled abacus in $\mathrm{LAbc}(\nu + \delta(N))$, for some $\nu \in \mathrm{Par}_N$. By ignoring the labels in the
decimated abacus, we see that $\nu$ arises from $\lambda$ by adding one k-ribbon at the border. The
shape of this ribbon determines the sign change caused by the bead move, as illustrated in
the following example.


11.37. Example. Take $N = 6$, $k = 4$, $\lambda = (3,3,2,0,0,0)$, and $\mu = \lambda + \delta(6) =
(8,7,5,2,1,0)$. Consider the following labeled abacus $v$ in $\mathrm{LAbc}(\mu)$ (a dot marks a gap):


position: 0 1 2 3 4 5 6 7 8 9
bead:     2 4 1 · · 5 · 6 3 ·


This abacus has weight $x_1^2 x_2^0 x_3^8 x_4^1 x_5^5 x_6^7$ and sign $\mathrm{sgn}(3,6,5,1,4,2) = (-1)^{10} = +1$.
Decimation by 4 produces the following 4-runner abacus:

Suppose we move bead 1 four positions to the right in the original abacus, from position 2
to position 6:


position: 0 1 2 3 4 5 6 7 8 9
bead:     2 4 · · · 5 1 6 3 ·


The new abacus has weight $x_1^6 x_2^0 x_3^8 x_4^1 x_5^5 x_6^7 = \mathrm{wt}(v)\, x_1^4$ and sign $\mathrm{sgn}(3,6,1,5,4,2) = (-1)^9 =
-1$. The change in weight arose since bead 1 moved 4 positions to the right. The change in
sign arose since bead 1 passed one other bead (bead 5) to reach its new position, and one
basic transposition is needed to transform the permutation 3,6,5,1,4,2 into 3,6,1,5,4,2.
The decimation of the new abacus looks like:


This abacus is in $\mathrm{LAbc}(\nu) = \mathrm{LAbc}(\alpha + \delta(6))$, where $\nu = (8,7,6,5,1,0)$ and $\alpha =
(3,3,3,3,0,0)$. Compare the diagrams of the partitions $\lambda$ and $\alpha$:


We obtain $\alpha$ from $\lambda$ by adding a new border 4-ribbon. To go from $\lambda$ to $\alpha$, we change
part of the frontier of $\lambda$ from NEENE (where the first N step corresponds to bead 1) to
EEENN (where the last N step corresponds to bead 1). There is one other N in this string,
corresponding to the one bead (labeled 5) that bead 1 passes when it moves to position 6.
Thus the number of passed beads (which is 1, here) is one less than the number of rows
occupied by the new border ribbon (which is 2, here).

Let us return to the original abacus $v$ and move bead 5 four positions, from position 5
to position 9 (a dot marks a gap):


position: 0 1 2 3 4 5 6 7 8 9
bead:     2 4 1 · · · · 6 3 5


This abacus has weight $x_1^2 x_2^0 x_3^8 x_4^1 x_5^9 x_6^7 = \mathrm{wt}(v)\, x_5^4$ and sign $\mathrm{sgn}(5,3,6,1,4,2) = (-1)^{10} = +1$.
Note that the sign is unchanged since two basic transpositions are required to change the
permutation 3,6,5,1,4,2 into 5,3,6,1,4,2. The decimation of the new abacus is:


r=0:  2 · 3 · ·
r=1:  4 · 5 · ·
r=2:  1 · · · ·
r=3:  · 6 · · ·


This abacus lies in $\mathrm{LAbc}(\beta + \delta(6))$ where $\beta = (4,4,4,0,0,0)$. The diagram of $\beta$ arises by
adding a border 4-ribbon to the diagram of $\lambda$:


This time the frontier changed from ...NENNE... (where the first N is bead 5) to
...EENNN... (where the last N is bead 5). The moved bead passed two other beads (beads
3 and 6), which is one less than the number of rows in the new ribbon (three). In general,
the number of passed beads is one less than the number of N's in the frontier substring
associated to the added ribbon, which is one less than the number of rows in the added
ribbon.

Finally, consider what would happen if we tried to move bead 4 (in the original abacus)
four positions to the right. A bead collision occurs with bead 5, so this move is impossible.
Now consider the labeled abacus $v'$ obtained by interchanging the labels 4 and 5 in $v$
(a dot marks a gap):


position: 0 1 2 3 4 5 6 7 8 9
bead:     2 5 1 · · 4 · 6 3 ·


Moving bead 5 four positions to the right in $v'$ causes a bead collision with bead 4. Notice
that $\mathrm{sgn}(v') = -\mathrm{sgn}(v)$ since $[3,6,4,1,5,2] = (4,5) \circ [3,6,5,1,4,2]$. Also note that $\mathrm{wt}(v)\, x_4^4 =
\mathrm{wt}(v')\, x_5^4$; this equality is valid precisely because of the bead collisions. The abaci $v$ and $v'$
are examples of a matched pair of oppositely signed objects that will cancel in the proof of
the Pieri rule, given below.
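
The sign and weight bookkeeping in the preceding example can be replayed mechanically. The Python sketch below is illustrative only; the list representation (entry i is the label at position i, with 0 for a gap) follows Definition 11.35, while the function names are hypothetical.

```python
def label_word(abacus):
    """The permutation w(v): bead labels read from right to left."""
    return [label for label in reversed(abacus) if label > 0]


def sign(abacus):
    """Sign of w(v), computed as (-1) to the number of inversions."""
    w = label_word(abacus)
    inversions = sum(1 for a in range(len(w)) for b in range(a + 1, len(w))
                     if w[a] > w[b])
    return (-1) ** inversions


def weight(abacus):
    """Exponent of x_j in wt(v), stored as {label j: position of bead j}."""
    return {label: pos for pos, label in enumerate(abacus) if label > 0}


def move(abacus, label, k):
    """Move the given bead k positions right; return None on a collision."""
    v = list(abacus)
    src = v.index(label)
    dst = src + k
    if dst >= len(v):
        v.extend([0] * (dst - len(v) + 1))   # lengthen the runner if needed
    if v[dst] != 0:
        return None
    v[src], v[dst] = 0, label
    return v


v = [2, 4, 1, 0, 0, 5, 0, 6, 3, 0]           # the abacus v of Example 11.37
print(sign(v), sign(move(v, 1, 4)), sign(move(v, 5, 4)), move(v, 4, 4))
```

Moving bead 1 flips the sign (one bead is passed), moving bead 5 preserves it (two beads are passed), and moving bead 4 is reported as a collision, in agreement with the example.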


The observations in the last example motivate the following definition. 


11.38. Definition: Spin and Sign of Ribbons. The spin of a ribbon $R$, denoted
$\mathrm{spin}(R)$, is one less than the number of rows occupied by the ribbon. The sign of $R$ is
$\mathrm{sgn}(R) = (-1)^{\mathrm{spin}(R)}$.


We now have all the combinatorial ingredients needed to prove the Pieri rule for multi- 
plication by a power-sum polynomial. 


11.39. Theorem: Antisymmetric Pieri Rule for $p_k$. For all $\lambda \in \mathrm{Par}_N$ and all $k \ge 1$,
the following identity holds in $K[x_1, \ldots, x_N]$:


\[ a_{\lambda+\delta(N)}(x_1, \ldots, x_N)\, p_k(x_1, \ldots, x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N : \\ \beta/\lambda \text{ is a } k\text{-ribbon } R}} \mathrm{sgn}(R)\, a_{\beta+\delta(N)}(x_1, \ldots, x_N). \]


Proof. Let $X$ be the set of pairs $(v,i)$, where $v \in \mathrm{LAbc}(\lambda + \delta(N))$ and $1 \le i \le N$. For $(v,i) \in
X$, set $\mathrm{sgn}(v,i) = \mathrm{sgn}(v)$ and $\mathrm{wt}(v,i) = \mathrm{wt}(v)\, x_i^k$. Then $a_{\lambda+\delta(N)}\, p_k = \sum_{z \in X} \mathrm{sgn}(z)\, \mathrm{wt}(z)$.
We introduce a weight-preserving, sign-reversing involution $I$ on $X$. Given $(v,i)$ in $X$, try to
move bead $i$ to the right $k$ positions in $v$. If this move causes a bead collision with bead $j$, let
$v'$ be $v$ with beads $i$ and $j$ switched, and set $I(v,i) = (v', j)$. Otherwise, set $I(v,i) = (v,i)$.
One verifies that $I$ is an involution.

Consider the case where $I(v,i) = (v',j) \ne (v,i)$. Since the label permutation $w(v')$ is
obtained from $w(v)$ by multiplying by the transposition $(i,j)$, we have $\mathrm{sgn}(v',j) = \mathrm{sgn}(v') =
-\mathrm{sgn}(v) = -\mathrm{sgn}(v,i)$. The weight of $v$ must have the form $x_i^a x_j^{a+k} \cdots$ because of the bead
collision, so $\mathrm{wt}(v') = x_j^a x_i^{a+k} \cdots$. It follows that $\mathrm{wt}(v,i) = \mathrm{wt}(v)\, x_i^k = \mathrm{wt}(v')\, x_j^k = \mathrm{wt}(v',j)$.
Thus, $I$ is a weight-preserving, sign-reversing map.

Now consider a fixed point $(v,i)$ of $I$. Let $v^*$ be the abacus obtained from $v$ by moving
bead $i$ to the right $k$ positions, so $\mathrm{wt}(v^*) = \mathrm{wt}(v)\, x_i^k = \mathrm{wt}(v,i)$. Since the unlabeled k-runner
abacus for $v^*$ arises from the unlabeled k-runner abacus for $v$ by moving one bead one step
along its runner, it follows that $v^* \in \mathrm{LAbc}(\beta + \delta(N))$ for a unique $\beta \in \mathrm{Par}_N$ such that $R =
\beta/\lambda$ is a k-ribbon. As argued earlier, $\mathrm{sgn}(v^*)$ differs from $\mathrm{sgn}(v)$ by the factor $\mathrm{sgn}(R) = (-1)^{\mathrm{spin}(R)}$,
where $\mathrm{spin}(R)$ is the number of beads that bead $i$ passes over when it moves. Conversely, any abacus
$y$ counted by $a_{\beta+\delta(N)}$ (for some shape $\beta$ as above) arises from a unique fixed point $(v,i) \in X$,
since the moved bead $i$ is uniquely determined by the shapes $\lambda$ and $\beta$, and $v$ is determined
from $y$ by moving the bead $i$ back $k$ positions. These remarks show that the sum appearing
on the right side of the theorem is the generating function for the fixed point set of $I$, which
completes the proof. □


11.40. Example. When $N = 6$, we calculate


\[ a_{(3,3,2)+\delta(6)}\, p_4 = -a_{(3,3,3,3)+\delta(6)} + a_{(4,4,4)+\delta(6)} - a_{(6,4,2)+\delta(6)} + a_{(7,3,2)+\delta(6)} + a_{(3,3,2,2,1,1)+\delta(6)} \]


by adding border 4-ribbons to the shape $(3,3,2)$, as shown here:


Observe that the last shape pictured does not contribute to the sum because it has more
than $N$ parts. An antisymmetric polynomial indexed by this shape would appear for $N \ge 7$.
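
The right side of the Pieri rule for $p_k$ can be generated by moving one bead at a time on the unlabeled abacus. The Python sketch below is a hedged illustration with a hypothetical function name: it forms the bead positions of lambda + delta(N), tries to move each bead k positions to the right, discards collisions, and records the sign as (-1) to the number of beads jumped over. It assumes lambda has at most N parts.

```python
def pieri_power_sum(lam, n, k):
    """Expand a_{lam + delta(n)} * p_k as a dict {beta: sign}."""
    lam = list(lam) + [0] * (n - len(lam))
    delta = list(range(n - 1, -1, -1))          # (n-1, ..., 1, 0)
    mu = [part + d for part, d in zip(lam, delta)]
    occupied = set(mu)
    expansion = {}
    for pos in mu:
        target = pos + k
        if target in occupied:
            continue                  # bead collision: these terms cancel
        passed = sum(1 for q in mu if pos < q < target)
        moved = sorted((occupied - {pos}) | {target}, reverse=True)
        beta = tuple(m - d for m, d in zip(moved, delta))
        expansion[tuple(p for p in beta if p > 0)] = (-1) ** passed
    return expansion


for beta, coeff in pieri_power_sum((3, 3, 2), 6, 4).items():
    print(coeff, beta)
```

With lambda = (3,3,2), N = 6, and k = 4 this yields the five signed shapes of Example 11.40. With N = 3 only the three shapes with at most three parts survive.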


11.8 Pieri Rule for $e_k$


Next we derive Pieri rules for calculating $a_{\lambda+\delta(N)}\, e_k$ and $a_{\lambda+\delta(N)}\, h_k$. Our starting point is
the following expression for the elementary symmetric polynomial $e_k$:


\[ e_k(x_1, \ldots, x_N) = \sum_{\substack{S \subseteq \{1,2,\ldots,N\} \\ |S| = k}} \prod_{j \in S} x_j. \]


Let $S = \{j_1, \ldots, j_k\}$ be a fixed k-element subset of $\{1, 2, \ldots, N\}$. Then $\prod_{j \in S} x_j =
x_{j_1} x_{j_2} \cdots x_{j_k}$ is a typical term in the polynomial $e_k$. On the other hand, a typical term
in $a_{\lambda+\delta(N)}$ corresponds to a signed, weighted abacus $v$. Let us investigate what happens to
the abacus when we multiply such a term by $x_{j_1} \cdots x_{j_k}$.

Since the power of $x_j$ indicates which position bead $j$ occupies, multiplication by
$x_{j_1} \cdots x_{j_k}$ should cause each of the beads labeled $j_1, \ldots, j_k$ to move one position to the
right. We execute this action by scanning the positions of $v$ from right to left. Whenever we
see a bead labeled $j$ for some $j \in S$, we move this bead one step to the right, thus multiplying
the weight by $x_j$. Bead collisions may occur, which will lead to object cancellations
in the proof below. In the case where no bead collisions happen, we obtain a new abacus
$v^* \in \mathrm{LAbc}(\nu + \delta(N))$. The beads on this abacus occur in the same order as on $v$, so $w(v^*) = w(v)$
and $\mathrm{sgn}(v^*) = \mathrm{sgn}(v)$. Recalling that the parts of $\lambda$ (resp. $\nu$) count the number of bead
moves needed to justify the beads in $v$ (resp. $v^*$), it follows that $\nu \in \mathrm{Par}_N$ is a partition
obtained from $\lambda \in \mathrm{Par}_N$ by adding 1 to $k$ distinct parts of $\lambda$. This means that the skew
shape $\nu/\lambda$ is a vertical strip of size $k$.


11.41. Example. Let $N = 6$ and $\lambda = (3,3,2,2)$. Let $v$ be the following abacus in
$\mathrm{LAbc}(\lambda+\delta(6))$ (a dot marks an empty position):

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  1  .  .  3  2  .  4  6  .

Suppose $k = 3$ and $S = \{1,2,3\}$. We move bead 2, then bead 3, then bead 1 one step right
on the abacus. No bead collision occurs, and we get the following abacus:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  .  1  .  .  3  2  4  6  .

This abacus lies in $\mathrm{LAbc}(\nu+\delta(6))$, where $\nu = (3,3,3,3,1)$. Comparing the diagrams of
$\lambda$ and $\nu$, we see that $\nu$ arises from $\lambda$ by adding a vertical 3-strip (one new cell in each
of rows 3, 4, and 5).

Suppose instead that $S = \{1,2,6\}$. This time we obtain the abacus

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  .  1  .  3  .  2  4  .  6

which is in $\mathrm{LAbc}((4,3,3,2,1)+\delta(6))$. Here the added vertical 3-strip consists of one new
cell in each of rows 1, 3, and 5.
However, suppose we start with the subset $S = \{3,5,6\}$. When we move bead 6, then bead
3, then bead 5 on the abacus $v$, bead 3 collides with bead 2. We can match the pair $(v,S)$
to $(w,T)$, where $T = \{2,5,6\}$ and $w$ is this abacus:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  1  .  .  2  3  .  4  6  .

Observe that $\mathrm{sgn}(w) = -\mathrm{sgn}(v)$ and $\mathrm{wt}(v)x_3x_5x_6 = \mathrm{wt}(w)x_2x_5x_6$. This example
illustrates the cancellation idea used in the proof below.
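The bead moves of this example can be replayed mechanically. The following Python sketch is only an illustration: the names `move_beads` and `shape`, and the encoding of an abacus as a dictionary from bead label to position, are assumptions made here rather than notation from the text. The abacus $v$ is entered with beads 5, 1, 3, 2, 4, 6 in positions 0, 1, 4, 5, 7, 8, as read from the example.

```python
def move_beads(abacus, S):
    """Scan positions from right to left, moving each bead whose label is in S
    one step to the right. `abacus` maps bead label -> position.
    Returns (new_abacus, None) if no collision occurs, or
    (None, (j, k)) if bead j bumps into bead k."""
    current = {pos: bead for bead, pos in abacus.items()}
    for pos in sorted(current, reverse=True):
        bead = current[pos]
        if bead not in S:
            continue
        if pos + 1 in current:          # target position is occupied
            return None, (bead, current[pos + 1])
        del current[pos]
        current[pos + 1] = bead
    return {bead: pos for pos, bead in current.items()}, None


def shape(abacus):
    """Partition nu (zero parts included) whose beads sit in positions nu + delta(N)."""
    positions = sorted(abacus.values(), reverse=True)
    n = len(positions)
    return tuple(p - (n - 1 - i) for i, p in enumerate(positions))


# Abacus v of the example (assumed encoding: bead label -> position).
v = {5: 0, 1: 1, 3: 4, 2: 5, 4: 7, 6: 8}

moved, collision = move_beads(v, {1, 2, 3})
print(shape(v), shape(moved))        # (3, 3, 2, 2, 0, 0) (3, 3, 3, 3, 1, 0)
print(move_beads(v, {3, 5, 6})[1])   # (3, 2): bead 3 bumps into bead 2
```

Scanning from right to left guarantees that a bead in $S$ has already vacated its position before its left neighbor tries to move there, so the first collision found is the rightmost one.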


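11.42. Theorem: Antisymmetric Pieri Rule for $e_k$. For all $\lambda \in \mathrm{Par}_N$ and all $k \geq 1$,
the following identity holds in $K[x_1,\ldots,x_N]$:

\[ a_{\lambda+\delta(N)}(x_1,\ldots,x_N)\, e_k(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\lambda \text{ is a vertical } k\text{-strip}}} a_{\beta+\delta(N)}(x_1,\ldots,x_N). \]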


Proof. Let $X$ be the set of pairs $(v,S)$ where $v \in \mathrm{LAbc}(\lambda+\delta(N))$ and $S$ is a $k$-element
subset of $\{1,2,\ldots,N\}$. Letting $\mathrm{sgn}(v,S) = \mathrm{sgn}(v)$ and $\mathrm{wt}(v,S) = \mathrm{wt}(v)\prod_{j \in S} x_j$, we have

\[ a_{\lambda+\delta(N)}\, e_k = \sum_{z \in X} \mathrm{sgn}(z)\,\mathrm{wt}(z). \]

Define an involution $I : X \to X$ as follows. Given $(v,S) \in X$, scan the abacus $v$ from right
to left and move each bead in $S$ one step to the right. If this can be done with no bead
collisions, we obtain an abacus $v^*$ counted by the sum on the right side of the theorem, such
that $\mathrm{sgn}(v) = \mathrm{sgn}(v^*)$ and $\mathrm{wt}(v,S) = \mathrm{wt}(v^*)$. In this case, $(v,S)$ is a fixed point of $I$, and
the bead motion rule defines a sign-preserving, weight-preserving bijection between these
fixed points and the abaci counted by the right side of the theorem.

Now suppose a bead collision does occur. Then for some $j \in S$ and some $k \notin S$, bead $k$
lies one step to the right of bead $j$ in $v$. Take $j$ to be the rightmost bead in $v$ for which this
is true. Let $I(v,S) = (v',S')$ where $v'$ is $v$ with beads $j$ and $k$ interchanged, and
$S' = (S \setminus \{j\}) \cup \{k\}$. It is immediately verified that $\mathrm{sgn}(v',S') = -\mathrm{sgn}(v,S)$,
$\mathrm{wt}(v,S) = \mathrm{wt}(v',S')$, and $I(v',S') = (v,S)$. So $I$ cancels all objects in which a bead
collision occurs. □


11.9 Pieri Rule for $h_k$


In the last section, we computed $a_{\lambda+\delta(N)}e_k$ by using a $k$-element subset of $\{1,2,\ldots,N\}$ to
move beads on a labeled abacus. Now we will compute $a_{\lambda+\delta(N)}h_k$ by moving beads based
on a $k$-element multiset. This approach is motivated by the formula

\[ h_k(x_1,\ldots,x_N) = \sum_{\substack{k\text{-element multisets } M \\ \text{of } \{1,\ldots,N\}}} \ \prod_{j \in M} x_j, \]

where the factor $x_j$ is repeated as many times as $j$ appears in $M$.

Suppose $v$ is an abacus counted by $a_{\lambda+\delta(N)}$, and $x_1^{m_1}\cdots x_N^{m_N}$ is a typical term in $h_k$ (so
each $m_j \geq 0$ and $m_1+\cdots+m_N = k$). Scan the beads in $v$ from left to right. Whenever we
encounter a bead labeled $j$, we move it right, one step at a time, for a total of $m_j$ positions.
Bead collisions may occur and will lead to object cancellations later. If no collision occurs,
we will have a new abacus $v^* \in \mathrm{LAbc}(\nu+\delta(N))$ with the same sign as $v$ and weight
$\mathrm{wt}(v^*) = \mathrm{wt}(v)\,x_1^{m_1}\cdots x_N^{m_N}$. It follows from the bead motion rule that the shape $\nu$ arises
from $\lambda$ by adding a horizontal $k$-strip to $\lambda$. Conversely, any abacus indexed by such a shape
can be constructed from an abacus indexed by $\lambda$ by a suitable choice of the bead multiset.
These ideas are illustrated in the following example, which should be compared to the
example in the preceding section.

11.43. Example. Let $N = 6$ and $\lambda = (3,3,2,2)$. Let $v$ be the following abacus in
$\mathrm{LAbc}(\lambda+\delta(6))$ (a dot marks an empty position):

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  1  .  .  3  2  .  4  6  .

Let $M$ be the multiset $[1,1,2]$. It is possible to move bead 1 to the right twice in a row, and
then move bead 2 once, without causing any collisions. This produces the following abacus:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  .  .  1  3  .  2  4  6  .

This abacus lies in $\mathrm{LAbc}(\nu+\delta(6))$, where $\nu = (3,3,3,2,2)$ arises from $\lambda$ by adding a
horizontal 3-strip (one new cell in row 3 and two new cells in row 5).

If instead we take $M = [1,2,6]$, we move bead 1, then bead 2, then bead 6, leading to this
abacus in $\mathrm{LAbc}((4,3,3,2,1)+\delta(6))$:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  .  1  .  3  .  2  4  .  6

On the other hand, suppose we try to modify $v$ using the multiset $M = [1,2,3]$. When
scanning $v$ from left to right, bead 3 moves before bead 2 and collides with bead 2. We
match the pair $(v,M)$ with the pair $(w,N)$, where $N = [1,2,2]$ and $w$ is the following
abacus:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  1  .  .  2  3  .  4  6  .

Observe that $\mathrm{sgn}(w) = -\mathrm{sgn}(v)$ and $\mathrm{wt}(v)x_1x_2x_3 = \mathrm{wt}(w)x_1x_2^2$. This example
illustrates the cancellation idea used in the proof below.


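11.44. Theorem: Antisymmetric Pieri Rule for $h_k$. For all $\lambda \in \mathrm{Par}_N$ and all $k \geq 1$,
the following identity holds in $K[x_1,\ldots,x_N]$:

\[ a_{\lambda+\delta(N)}(x_1,\ldots,x_N)\, h_k(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\lambda \text{ is a horizontal } k\text{-strip}}} a_{\beta+\delta(N)}(x_1,\ldots,x_N). \]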
Proof. Let $X$ be the set of pairs $(v,M)$ where $v \in \mathrm{LAbc}(\lambda+\delta(N))$ and
$M = [1^{m_1}2^{m_2}\cdots N^{m_N}]$ is a $k$-element multiset. Putting $\mathrm{sgn}(v,M) = \mathrm{sgn}(v)$ and
$\mathrm{wt}(v,M) = \mathrm{wt}(v)\prod_{j=1}^{N} x_j^{m_j}$, we have

\[ a_{\lambda+\delta(N)}\, h_k = \sum_{z \in X} \mathrm{sgn}(z)\,\mathrm{wt}(z). \]

Define an involution $I : X \to X$ as follows. Given $(v,M) \in X$, scan the abacus $v$ from
left to right. When bead $j$ is encountered in the scan, move it $m_j$ steps right, one step
at a time. If all bead motions are completed with no bead collisions, we obtain an abacus
$v^*$ counted by the sum on the right side of the theorem, such that $\mathrm{sgn}(v) = \mathrm{sgn}(v^*)$ and
$\mathrm{wt}(v,M) = \mathrm{wt}(v^*)$. In this case, $(v,M)$ is a fixed point of $I$, and the bead motion rule
defines a sign-preserving, weight-preserving bijection between these fixed points and the
abaci counted by the right side of the theorem.

Now consider the case where a bead collision does occur. Suppose the first collision
occurs when bead $j$ hits a bead $k$ that is located $p \leq m_j$ positions to the right of bead $j$'s
initial position. Define $I(v,M) = (v',M')$, where $v'$ is $v$ with beads $j$ and $k$ interchanged,
and $M'$ is obtained from $M$ by letting $j$ occur $m_j - p \geq 0$ times in $M'$, letting $k$ occur
$m_k + p$ times in $M'$, and leaving all other multiplicities the same. One may check that
$\mathrm{sgn}(v',M') = -\mathrm{sgn}(v,M)$, $\mathrm{wt}(v,M) = \mathrm{wt}(v',M')$, and $I(v',M') = (v,M)$. So $I$ cancels all
objects in which a bead collision occurs. □
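The identity of Theorem 11.44 can be checked by brute force for small $N$. The sketch below is only a sanity check under stated assumptions: polynomials are stored as dictionaries from exponent tuples to coefficients, and the helper names (`antisymmetric`, `complete`, `horizontal_strips`, `check_pieri_h`) are invented here. The sign convention chosen for $a_\mu$ does not matter, because the same function is used on both sides.

```python
from itertools import combinations_with_replacement, permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def antisymmetric(mu):
    """a_mu as {exponent tuple: coefficient}; mu must be strictly decreasing."""
    poly = {}
    for perm in permutations(range(len(mu))):
        exps = [0] * len(mu)
        for i, var in enumerate(perm):
            exps[var] = mu[i]            # variable number perm[i] gets exponent mu_i
        poly[tuple(exps)] = sign(perm)
    return poly


def complete(k, n):
    """h_k in n variables: one monomial per k-element multiset."""
    poly = {}
    for multiset in combinations_with_replacement(range(n), k):
        exps = [0] * n
        for j in multiset:
            exps[j] += 1
        poly[tuple(exps)] = 1
    return poly


def multiply(f, g):
    out = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            e = tuple(a + b for a, b in zip(ef, eg))
            out[e] = out.get(e, 0) + cf * cg
    return {e: c for e, c in out.items() if c != 0}


def horizontal_strips(lam, k):
    """Partitions beta (same length as lam) with beta/lam a horizontal k-strip."""
    n, found = len(lam), []

    def extend(i, remaining, beta):
        if i == n:
            if remaining == 0:
                found.append(tuple(beta))
            return
        # Row i may not grow past the old length of row i-1 (one new cell per column).
        cap = lam[i - 1] if i > 0 else lam[0] + remaining
        for b in range(lam[i], min(cap, lam[i] + remaining) + 1):
            extend(i + 1, remaining - (b - lam[i]), beta + [b])

    extend(0, k, [])
    return found


def check_pieri_h(lam, k):
    """Compare both sides of the antisymmetric Pieri rule for h_k (N = len(lam))."""
    n = len(lam)
    delta = range(n - 1, -1, -1)
    lhs = multiply(antisymmetric(tuple(p + d for p, d in zip(lam, delta))), complete(k, n))
    rhs = {}
    for beta in horizontal_strips(lam, k):
        for e, c in antisymmetric(tuple(p + d for p, d in zip(beta, delta))).items():
            rhs[e] = rhs.get(e, 0) + c
    return lhs == rhs


print(horizontal_strips((2, 1, 0), 2))   # [(2, 2, 1), (3, 1, 1), (3, 2, 0), (4, 1, 0)]
print(check_pieri_h((2, 1, 0), 2))       # True
```

The partition must be padded with zeros to exactly $N$ parts, matching the convention that $\lambda$ and $\beta$ both lie in $\mathrm{Par}_N$.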




11.10 Antisymmetric Polynomials and Schur Polynomials 


The Pieri rule for computing $a_{\lambda+\delta(N)}h_k$ closely resembles the rule for computing $s_\lambda h_k$
from §10.12. This resemblance leads to an algebraic proof of a formula expressing Schur
polynomials as quotients of antisymmetric polynomials.


11.45. Theorem: Schur Polynomials and Antisymmetric Polynomials. For all
$\lambda \in \mathrm{Par}_N$,

\[ s_\lambda(x_1,\ldots,x_N) = \frac{a_{\lambda+\delta(N)}(x_1,\ldots,x_N)}{a_{\delta(N)}(x_1,\ldots,x_N)} = \frac{\det\|x_j^{\lambda_i+N-i}\|_{1\leq i,j\leq N}}{\det\|x_j^{N-i}\|_{1\leq i,j\leq N}}. \]


Proof. In 10.69, we iterated the Pieri rule

\[ s_\nu(x_1,\ldots,x_N)\, h_k(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\nu \text{ is a horizontal } k\text{-strip}}} s_\beta(x_1,\ldots,x_N) \]

to deduce the formula

\[ h_\mu(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}_N} K_{\lambda,\mu}\, s_\lambda(x_1,\ldots,x_N) \quad (\mu \in \mathrm{Par}). \qquad (11.3) \]

Recall that this derivation used semistandard tableaux to encode the sequence of horizontal
strips that were added to go from the empty shape to the shape $\lambda$. Now, precisely the same
idea can be applied to iterate the antisymmetric Pieri rule

\[ a_{\nu+\delta(N)}(x_1,\ldots,x_N)\, h_k(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\nu \text{ is a horizontal } k\text{-strip}}} a_{\beta+\delta(N)}(x_1,\ldots,x_N). \]

If we start with $\nu = (0)$ and multiply successively by $h_{\mu_1}, h_{\mu_2}, \ldots$, we obtain the formula

\[ a_{0+\delta(N)}(x_1,\ldots,x_N)\, h_\mu(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}_N} K_{\lambda,\mu}\, a_{\lambda+\delta(N)}(x_1,\ldots,x_N) \quad (\mu \in \mathrm{Par}). \qquad (11.4) \]

Now restrict attention to partitions $\lambda, \mu \in \mathrm{Par}_N(m)$. As in 10.72, we can write
equations (11.3) in the form $H = K^t S$, where $H = (h_\mu : \mu \in \mathrm{Par}_N(m))$ and
$S = (s_\lambda : \lambda \in \mathrm{Par}_N(m))$ are column vectors, and $K^t$ is the transpose of the Kostka matrix.
Letting $A = (a_{\lambda+\delta(N)}/a_{\delta(N)} : \lambda \in \mathrm{Par}_N(m))$, we can similarly write equations (11.4) in the
form $H = K^t A$. Finally, since the Kostka matrix is invertible (being unitriangular), we can
conclude that

\[ A = (K^t)^{-1} H = S. \]

Equating entries of these vectors gives the desired result. □


A purely combinatorial proof of the identity $a_{\lambda+\delta(N)} = s_\lambda a_{\delta(N)}$ will be given in §11.12.
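As a numerical illustration of Theorem 11.45, one can evaluate the quotient of determinants at a point and compare it with the tableau definition of the Schur polynomial. The sketch below assumes a brute-force enumeration of fillings (practical only for tiny shapes); the function names are hypothetical and the evaluation point $(2,3,5)$ is an arbitrary choice with distinct coordinates.

```python
from fractions import Fraction
from itertools import product


def det(matrix):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def schur_by_determinants(lam, xs):
    """a_{lam+delta(N)} / a_{delta(N)} evaluated at xs, where N = len(xs)."""
    n = len(xs)
    lam = list(lam) + [0] * (n - len(lam))
    # With 0-based row index i, the exponent lambda_i + N - i becomes lam[i] + n - 1 - i.
    top = [[x ** (lam[i] + n - 1 - i) for x in xs] for i in range(n)]
    bottom = [[x ** (n - 1 - i) for x in xs] for i in range(n)]
    return Fraction(det(top), det(bottom))


def schur_by_tableaux(lam, xs):
    """Sum of the content monomials of all semistandard tableaux of shape lam
    with entries in 1..N, evaluated at xs."""
    n = len(xs)
    cells = [(r, c) for r, row_len in enumerate(lam) for c in range(row_len)]
    total = 0
    for values in product(range(1, n + 1), repeat=len(cells)):
        filling = dict(zip(cells, values))
        rows_ok = all(filling[r, c - 1] <= filling[r, c] for (r, c) in cells if c > 0)
        cols_ok = all(filling[r - 1, c] < filling[r, c] for (r, c) in cells if r > 0)
        if rows_ok and cols_ok:
            weight = 1
            for value in values:
                weight *= xs[value - 1]
            total += weight
    return total


point = (2, 3, 5)
print(schur_by_determinants((2, 1), point), schur_by_tableaux((2, 1), point))   # 280 280
```

Exact rational arithmetic (`Fraction`) avoids any rounding issue in the quotient.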


11.11 Rim-Hook Tableaux 


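The connection between Schur polynomials and antisymmetric polynomials lets us deduce
the following Pieri rule for calculating the product $s_\lambda p_k$.


11.46. Theorem: Symmetric Pieri Rule for $p_k$. For all $\lambda \in \mathrm{Par}_N$ and all $k \geq 1$, the
following identity holds in $K[x_1,\ldots,x_N]$:

\[ s_\lambda(x_1,\ldots,x_N)\, p_k(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\lambda \text{ is a } k\text{-ribbon } R}} \mathrm{sgn}(R)\, s_\beta(x_1,\ldots,x_N). \]

Proof. Start with the identity

\[ a_{\lambda+\delta(N)}(x_1,\ldots,x_N)\, p_k(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\lambda \text{ is a } k\text{-ribbon } R}} \mathrm{sgn}(R)\, a_{\beta+\delta(N)}(x_1,\ldots,x_N) \]

(proved in 11.39), divide both sides by $a_{\delta(N)}$, and use 11.45. □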


11.47. Example. Suppose we multiply $s_{(0)} = 1$ by $p_4$ using the Pieri rule. The result is a
signed sum of Schur polynomials indexed by 4-ribbons:

\[ p_4 = s_{(0)}p_4 = s_{(4)} - s_{(3,1)} + s_{(2,1,1)} - s_{(1,1,1,1)}. \]

To expand $p_{(4,3)}$ into Schur polynomials, first multiply both sides of the previous equation
by $p_3$:

\[ p_{(4,3)} = p_4p_3 = s_{(4)}p_3 - s_{(3,1)}p_3 + s_{(2,1,1)}p_3 - s_{(1,1,1,1)}p_3. \]

Now use the Pieri rule on each term on the right side. This leads to the diagrams shown in
Figure 11.3. Taking signs into account, we obtain the formula

\begin{align*}
p_{(4,3)} ={} & s_{(7)} + s_{(4,3)} - s_{(4,2,1)} + s_{(4,1,1,1)} \\
 & - s_{(6,1)} + s_{(3,2,2)} - s_{(3,1,1,1,1)} \\
 & + s_{(5,1,1)} - s_{(3,3,1)} + s_{(2,1,1,1,1,1)} \\
 & - s_{(4,1,1,1)} + s_{(3,2,1,1)} - s_{(2,2,2,1)} - s_{(1,1,1,1,1,1,1)}.
\end{align*}

Here we are assuming $N$ (the number of variables) is at least 7.
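The expansion in this example can be reproduced by adding ribbons mechanically. The sketch below is an illustration under one assumption not spelled out in this example: a partition is encoded by its bead positions $\lambda+\delta$ (padded with enough zero parts), so that adding a $k$-ribbon means moving one bead $k$ positions to the right into an empty position, with sign $(-1)^{j}$ where $j$ is the number of beads jumped over. The names `add_ribbons` and `times_power_sum` are invented for this sketch.

```python
def add_ribbons(lam, k):
    """Yield (beta, sign) for every partition beta such that beta/lam is a k-ribbon."""
    n = len(lam) + k                       # enough beads to allow up to k new rows
    padded = list(lam) + [0] * k
    beads = [padded[i] + n - 1 - i for i in range(n)]   # strictly decreasing positions
    occupied = set(beads)
    for b in beads:
        if b + k in occupied:
            continue                        # target occupied: no ribbon arises
        jumped = sum(1 for c in beads if b < c < b + k)
        new_beads = sorted((occupied - {b}) | {b + k}, reverse=True)
        beta = tuple(p - (n - 1 - i) for i, p in enumerate(new_beads))
        yield tuple(part for part in beta if part > 0), (-1) ** jumped


def times_power_sum(expansion, k):
    """Multiply a Schur expansion {partition: coefficient} by p_k."""
    result = {}
    for lam, coeff in expansion.items():
        for beta, sign in add_ribbons(lam, k):
            result[beta] = result.get(beta, 0) + coeff * sign
    return {beta: c for beta, c in result.items() if c != 0}


p4 = times_power_sum({(): 1}, 4)
print(p4)   # {(4,): 1, (3, 1): -1, (2, 1, 1): 1, (1, 1, 1, 1): -1}

p43 = times_power_sum(p4, 3)
for shape in sorted(p43, reverse=True):
    print(shape, p43[shape])
```

Because like terms are collected, the two occurrences of $s_{(4,1,1,1)}$ with opposite signs cancel, and the dictionary `p43` contains the remaining twelve terms.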
[Figure 11.3, diagrams not reproduced: four panels showing the shapes for $s_{(4)}p_3$,
$-s_{(3,1)}p_3$, $s_{(2,1,1)}p_3$, and $-s_{(1,1,1,1)}p_3$.]

FIGURE 11.3
Adding $k$-ribbons to compute $s_\lambda p_k$.
Just as we used semistandard tableaux to encode successive additions of horizontal
strips, we can use the following notion of a rim-hook tableau to encode successive additions
of signed ribbons.


11.48. Definition: Rim-Hook Tableaux. Given a partition $\lambda$ and a sequence $\alpha \in \mathbb{N}^s$,
a rim-hook tableau of shape $\lambda$ and content $\alpha$ is a sequence $T$ of partitions

\[ (0) = \nu^0 \subseteq \nu^1 \subseteq \nu^2 \subseteq \cdots \subseteq \nu^s = \lambda \]

such that $\nu^i/\nu^{i-1}$ is an $\alpha_i$-ribbon for $1 \leq i \leq s$. We represent this tableau pictorially by
drawing the diagram of $\lambda$ and entering the number $i$ in each cell of the ribbon $\nu^i/\nu^{i-1}$. The
sign of the rim-hook tableau $T$ is the product of the signs of the ribbons $\nu^i/\nu^{i-1}$. (Recall
that the sign of a ribbon occupying $r$ rows is $(-1)^{r-1}$.) Let $\mathrm{RHT}(\lambda,\alpha)$ be the set of all
rim-hook tableaux of shape $\lambda$ and content $\alpha$. Finally, define the integer

\[ \chi^\lambda_\alpha = \sum_{T \in \mathrm{RHT}(\lambda,\alpha)} \mathrm{sgn}(T). \]

Rim-hook tableaux of skew shape $\lambda/\mu$ are defined analogously; now we require that $\nu^0 = \mu$,
so that the cells of $\mu$ do not get filled with ribbons. The set $\mathrm{RHT}(\lambda/\mu,\alpha)$ and the integer
$\chi^{\lambda/\mu}_\alpha$ are defined as above.

11.49. Example. Suppose we expand the product $p_4p_2p_3p_1$ into a sum of Schur polynomials.
We can do this by applying the Pieri rule four times, starting with the empty shape.
Each application of the Pieri rule will add a new border ribbon to the shape. The lengths
of the ribbons are given by the content vector $\alpha = (4,2,3,1)$. Here is one possible sequence
of ribbon additions:

    (0)  ->  (2,1,1)  ->  (2,2,2)  ->  (4,3,2)  ->  (4,3,3)

This sequence of shapes defines a rim-hook tableau

\[ T = ((0), (2,1,1), (2,2,2), (4,3,2), (4,3,3)) \]

which can be visualized using the following diagram:

    1 1 3 3
    1 2 3
    1 2 4

Note that the ribbons we added have signs $+1$, $-1$, $-1$, and $+1$, so $\mathrm{sgn}(T) = +1$. This
particular choice of ribbon additions will therefore produce a term $+s_{(4,3,3)}$ in the Schur
expansion of $p_{(4,2,3,1)}$.

Now suppose we want to know the coefficient of $s_{(4,3,3)}$ in the Schur expansion of
$p_4p_2p_3p_1$. The preceding discussion shows that we will obtain a term $\pm s_{(4,3,3)}$ for every
rim-hook tableau of shape $(4,3,3)$ and content $(4,2,3,1)$, where the sign of the term is the
sign of the tableau. To find the desired coefficient, we must enumerate all the objects in
$\mathrm{RHT}((4,3,3),(4,2,3,1))$. In addition to the tableau $T$ displayed above, we find the following
tableaux:

    1 1 3 4      1 1 2 2      1 1 1 4      1 1 1 1
    1 2 3        1 3 3        1 2 2        2 3 3
    1 2 3        1 3 4        3 3 3        2 3 4

The signs of the new tableaux are $-1$, $-1$, $-1$, and $+1$, so the coefficient is
$+1-1-1-1+1 = -1$.
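The enumeration in this example can be automated by peeling ribbons off the shape in reverse order. The following sketch is illustrative only; it assumes the bead encoding of a partition (positions $\lambda_i + n - i$ for a partition with $n$ parts), under which removing a $k$-ribbon moves one bead $k$ positions to the left, and the names `remove_ribbons` and `rim_hook_signs` are made up here.

```python
def remove_ribbons(lam, k):
    """Yield (mu, sign) for each partition mu such that lam/mu is a k-ribbon."""
    n = len(lam)
    beads = [lam[i] + n - 1 - i for i in range(n)]
    occupied = set(beads)
    for b in beads:
        if b - k < 0 or b - k in occupied:
            continue
        # The ribbon occupies one more row than the number of beads jumped over.
        jumped = sum(1 for c in beads if b - k < c < b)
        new_beads = sorted((occupied - {b}) | {b - k}, reverse=True)
        mu = tuple(p - (n - 1 - i) for i, p in enumerate(new_beads))
        yield tuple(part for part in mu if part > 0), (-1) ** jumped


def rim_hook_signs(lam, alpha):
    """Signs of all rim-hook tableaux of shape lam and content alpha."""
    if not alpha:
        return [1] if sum(lam) == 0 else []
    signs = []
    for mu, sign in remove_ribbons(lam, alpha[-1]):   # the last ribbon added comes off first
        signs.extend(sign * s for s in rim_hook_signs(mu, alpha[:-1]))
    return signs


signs = rim_hook_signs((4, 3, 3), (4, 2, 3, 1))
print(len(signs), sum(signs))   # 5 -1
```

The five signs found this way are two $+1$'s and three $-1$'s, in agreement with the five tableaux of the example.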




The calculations in the preceding example generalize to give the following rule for
expanding power-sum polynomials into sums of Schur polynomials.


11.50. Theorem: Schur Expansion of Power-Sum Polynomials. For all vectors
$\alpha \in \mathbb{N}^s$ and all $N \geq 1$,

\[ p_\alpha(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}_N} \chi^\lambda_\alpha\, s_\lambda(x_1,\ldots,x_N). \]

Proof. By iteration of the Pieri rule, the coefficient of $s_\lambda$ in $p_\alpha = s_{(0)}p_{\alpha_1}\cdots p_{\alpha_s}$ is the
signed sum over all sequences of partitions

\[ 0 = \nu^0 \subseteq \nu^1 \subseteq \nu^2 \subseteq \cdots \subseteq \nu^s = \lambda \]

such that the skew shape $\nu^i/\nu^{i-1}$ is an $\alpha_i$-ribbon for all $i$. By the very definition of rim-hook
tableaux, this sum is precisely $\chi^\lambda_\alpha$. □


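11.51. Theorem: Symmetry of $\chi^\lambda_\alpha$. If $\alpha$ and $\beta$ are compositions with $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$,
then $\chi^\lambda_\alpha = \chi^\lambda_\beta$ for all partitions $\lambda$.


Proof. The hypothesis implies that the sequence $\alpha = (\alpha_1,\alpha_2,\ldots)$ can be rearranged to the
sequence $\beta = (\beta_1,\beta_2,\ldots)$. It follows from this that $p_\alpha = \prod_i p_{\alpha_i} = \prod_i p_{\beta_i} = p_\beta$, since
multiplication of polynomials is commutative. Let $k = \sum_i \alpha_i$ and take $N \geq k$. Two applications
of the previous theorem give

\[ \sum_{\lambda \in \mathrm{Par}(k)} \chi^\lambda_\alpha s_\lambda = p_\alpha = p_\beta = \sum_{\lambda \in \mathrm{Par}(k)} \chi^\lambda_\beta s_\lambda. \]

By linear independence of the Schur polynomials $\{s_\lambda(x_1,\ldots,x_N) : \lambda \in \mathrm{Par}(k)\}$, we conclude
that $\chi^\lambda_\alpha = \chi^\lambda_\beta$ for all $\lambda$. □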


11.52. Remark. The last theorem and corollary extend to skew shapes as follows. If $\mu$ is
a partition, then

\[ s_\mu p_\alpha = \sum_{\substack{\lambda \in \mathrm{Par}_N: \\ \mu \subseteq \lambda}} \chi^{\lambda/\mu}_\alpha\, s_\lambda. \]

Furthermore, if $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$ then $\chi^{\lambda/\mu}_\alpha = \chi^{\lambda/\mu}_\beta$. The proof is the same as before,
replacing $(0)$ by $\mu$ throughout.


We have just seen how to expand power-sum symmetric polynomials into sums of Schur
polynomials. Conversely, it is possible to express Schur polynomials in terms of the $p_\mu$'s.
We can use the Hall scalar product from §10.26 to derive this expansion from the previous
one.


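11.53. Theorem: Power-Sum Expansion of Schur Polynomials. For $N \geq k$ and all
$\lambda \in \mathrm{Par}(k)$,

\[ s_\lambda(x_1,\ldots,x_N) = \sum_{\mu \in \mathrm{Par}(k)} \frac{\chi^\lambda_\mu}{z_\mu}\, p_\mu(x_1,\ldots,x_N). \]

Proof. For all $\mu \in \mathrm{Par}(k)$, we know that $p_\mu = \sum_{\nu \in \mathrm{Par}(k)} \chi^\nu_\mu s_\nu$. Therefore, for a given
partition $\lambda \in \mathrm{Par}(k)$,

\[ \langle p_\mu, s_\lambda \rangle = \sum_{\nu \in \mathrm{Par}(k)} \chi^\nu_\mu \langle s_\nu, s_\lambda \rangle = \chi^\lambda_\mu, \]

since the Schur polynomials are orthonormal relative to the Hall scalar product. Now, since
the $p_\mu$'s form a basis of $\Lambda_N^k$, we know there exist scalars $c_\nu \in \mathbb{Q}$ with $s_\lambda = \sum_\nu c_\nu p_\nu$. To find
a given coefficient $c_\mu$, we compute

\[ \chi^\lambda_\mu = \langle p_\mu, s_\lambda \rangle = \sum_\nu c_\nu \langle p_\mu, p_\nu \rangle = c_\mu z_\mu, \]

where the last equality follows by definition of the Hall scalar product. We see that
$c_\mu = \chi^\lambda_\mu / z_\mu$, as desired. □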
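To make the coefficients $\chi^\lambda_\mu / z_\mu$ concrete, the following sketch computes them for small partitions. It relies on assumptions that are not restated in this section: $z_\mu = \prod_i i^{m_i} m_i!$ (with $m_i$ the number of parts of $\mu$ equal to $i$) is taken as the standard definition from the Hall scalar product, and $\chi^\lambda_\mu$ is computed by removing ribbons in the bead encoding of a partition. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def partitions(k, largest=None):
    """All partitions of k as weakly decreasing tuples."""
    if largest is None:
        largest = k
    if k == 0:
        return [()]
    result = []
    for first in range(min(k, largest), 0, -1):
        result.extend((first,) + rest for rest in partitions(k - first, first))
    return result


def z(mu):
    """z_mu = product over i of i^{m_i} * m_i! (assumed standard definition)."""
    value = 1
    for part, mult in Counter(mu).items():
        value *= part ** mult * factorial(mult)
    return value


def chi(lam, alpha):
    """Signed count of rim-hook tableaux of shape lam and content alpha."""
    if not alpha:
        return 1 if not lam else 0
    n, k = len(lam), alpha[-1]
    beads = [lam[i] + n - 1 - i for i in range(n)]
    total = 0
    for b in beads:
        if b - k < 0 or b - k in beads:
            continue
        jumped = sum(1 for c in beads if b - k < c < b)
        new_beads = sorted([c for c in beads if c != b] + [b - k], reverse=True)
        mu = tuple(p - (n - 1 - i) for i, p in enumerate(new_beads))
        mu = tuple(part for part in mu if part > 0)
        total += (-1) ** jumped * chi(mu, alpha[:-1])
    return total


def power_sum_coefficients(lam):
    """Coefficient of p_mu in s_lam, for every partition mu of |lam|."""
    return {mu: Fraction(chi(lam, mu), z(mu)) for mu in partitions(sum(lam))}


print(power_sum_coefficients((2, 1)))
# {(3,): Fraction(-1, 3), (2, 1): Fraction(0, 1), (1, 1, 1): Fraction(1, 3)}
```

For $\lambda = (2,1)$ this gives $s_{(2,1)} = \frac{1}{3}p_{(1,1,1)} - \frac{1}{3}p_{(3)}$.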


11.12 Abaci and Tableaux 


This section contains a combinatorial proof of the identity

\[ a_{\delta(N)}(x_1,\ldots,x_N)\, s_\lambda(x_1,\ldots,x_N) = a_{\lambda+\delta(N)}(x_1,\ldots,x_N), \]

which we proved algebraically in §11.10.
Let $X$ be the set of pairs $(v,T)$, where $v$ is a justified labeled abacus with $N$ beads and
$T$ is a semistandard tableau using letters in $\{1,2,\ldots,N\}$. It will be convenient to use the
following non-standard total ordering on this alphabet that depends on $v$: $i <_v j$ iff bead $i$
is to the right of bead $j$ on the abacus $v$. Equivalently, we can describe the total order by
writing

\[ v_{N-1} <_v v_{N-2} <_v \cdots <_v v_1 <_v v_0. \]


Here are two examples of objects in $X$ when $N = 7$ and $\lambda = (7,7,5,3,2)$:

    (v^1, T^1) = ( 7654321000... , T^1 ),      (v^2, T^2) = ( 2451763000... , T^2 ),

    T^1 =  1 1 1 1 1 1 1        T^2 =  3 3 3 3 3 3 3
           2 2 2 2 2 2 2               6 6 6 6 6 6 6
           3 3 5 5 5                   7 7 5 5 5
           6 6 6                       4 4 4
           7 7                         2 2

Note that we can pass from the first tableau (which is semistandard under the usual
ordering) to the second tableau (which is semistandard relative to one of the non-standard
orderings) by applying the permutation $7 \mapsto 2, 6 \mapsto 4, \ldots$ to each entry in the first tableau.
It follows that the generating function for the set $\mathrm{SSYT}_N(\lambda)$ relative to one of the orderings
$<_v$ can be obtained from the usual generating function for semistandard tableaux (namely
$s_\lambda(x_1,\ldots,x_N)$) by applying the permutation $x_7 \mapsto x_2, x_6 \mapsto x_4, \ldots$. Since Schur polynomials
are symmetric, the answer is still $s_\lambda(x_1,\ldots,x_N)$. By the product rule for weighted sets,
we conclude that

\[ \sum_{(v,T) \in X} \mathrm{sgn}(v)\,\mathrm{wt}(v)\,\mathrm{wt}(T) = a_{\delta(N)}(x_1,\ldots,x_N)\, s_\lambda(x_1,\ldots,x_N). \]
On the other hand, the generating function for the set $Y$ of $N$-bead labeled abaci with beads
in positions $\lambda+\delta(N)$ is $a_{\lambda+\delta(N)}(x_1,\ldots,x_N)$. So it suffices to define a sign-reversing,
weight-preserving involution $I : X \to X$ where the fixed point set of $I$ corresponds bijectively to
$Y$.

The main idea is that the tableau $T$ encodes a sequence of bead motions on the abacus
$v$. If performing these movements causes a bead collision, then $(v,T)$ will cancel with some
other object in $X$. Otherwise, the abacus obtained from $v$ by the bead motions will be one
of the objects in $Y$.

A tableau $T$ specifies bead motions as follows. Define the word of $T$ to be the word
$w(T) = w_1w_2\cdots w_n$ (where $n = |\lambda|$) obtained by concatenating the rows of $T$ from bottom
to top. For example, the object $(v^2,T^2)$ shown above has

\[ w(T^2) = 224447755566666663333333. \]

Now, given $(v,T) \in X$, scan the symbols in $w(T)$ from right to left. When a symbol $j$ is
encountered, move the bead labeled $j$ in $v$ one step to the right.

Let us first determine which objects $(v,T)$ have no bead collisions. Suppose
$v = v_0\cdots v_{N-1}00\cdots$. Let $i$ be the last entry in the top row of $T$, which is the rightmost letter
in $w(T)$. We must first move bead $i$ one step to the right. This move will already cause a
collision (since $v$ is justified) unless $i = v_{N-1}$. Since $v_{N-1}$ is the smallest letter relative to
$<_v$ and $T$ is semistandard, $i = v_{N-1}$ iff all entries in the top row of $T$ are equal to $v_{N-1}$.
In this situation, we will move the rightmost bead $v_{N-1}$ to the right $\lambda_1$ positions with no
collisions.

Now we repeat the argument on the second row of $T$. The rightmost entry $j$ in this row
cannot be $v_{N-1}$ (otherwise we would not have a strict increase in every column). The only
way to avoid an immediate bead collision is when $j = v_{N-2}$, in which case all entries in
the second row must equal $v_{N-2}$. In this situation, bead $v_{N-2}$ will move to the right $\lambda_2$
positions with no collisions.

Continuing similarly, we see that $(v,T)$ will have no collisions iff for all $k$, the $k$th row
of $T$ consists of $\lambda_k$ copies of the $k$th smallest letter $v_{N-k}$. Moving the beads on $v$ according
to $T$ has the effect of unjustifying $v$ to an abacus $v^* \in Y = \mathrm{LAbc}(\lambda+\delta(N))$. Defining
$I(v,T) = (v,T)$ in this case, we therefore have specified a bijection between the fixed points
of $I$ and $Y$. For example, taking $T$ to be the tableau with rows 3333333, 6666666, 77777,
111, and 55,

    (v, T) = ( 2451763000... , T )   |->   v^* = 24005010070063000... .

The map $(v,T) \mapsto v^*$ preserves signs and weights.
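The bead motions dictated by $w(T)$ are easy to simulate. The sketch below is a hedged illustration: the function names are invented, a tableau is stored as a list of rows, and the two tableaux used are the ones determined by the text for $v = 2451763$ (the collision-free tableau whose $k$th row consists of $\lambda_k$ copies of $v_{N-k}$, and the tableau with word $224447755566666663333333$).

```python
def word(tableau):
    """Concatenate the rows of the tableau from bottom to top."""
    return [entry for row in reversed(tableau) for entry in row]


def run_bead_motions(v, tableau):
    """v lists the bead labels of a justified abacus in positions 0, 1, 2, ...
    Scan w(T) from right to left, moving the named bead one step right each time.
    Returns ('ok', abacus_string) with 0 marking gaps, or
    ('collision', (i, j)) if bead i bumps into bead j."""
    position = {bead: pos for pos, bead in enumerate(v)}
    at = {pos: bead for bead, pos in position.items()}
    for symbol in reversed(word(tableau)):
        target = position[symbol] + 1
        if target in at:
            return 'collision', (symbol, at[target])
        del at[position[symbol]]
        at[target] = symbol
        position[symbol] = target
    return 'ok', ''.join(str(at.get(pos, 0)) for pos in range(max(at) + 1))


v = [2, 4, 5, 1, 7, 6, 3]
fixed = [[3] * 7, [6] * 7, [7] * 5, [1] * 3, [5] * 2]
other = [[3] * 7, [6] * 7, [7, 7, 5, 5, 5], [4] * 3, [2] * 2]

print(run_bead_motions(v, fixed))   # ('ok', '24005010070063')
print(run_bead_motions(v, other))   # ('collision', (5, 1))
```

In the second call, the first two rows are processed without incident, and the collision arises from the 5 at the end of the third row.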

To complete the proof, we describe a cancellation mechanism to pair off objects $(v,T)$ in
which bead collisions do occur. Suppose the first bead collision for $(v,T)$ occurs when some
bead $i$ moves to the right one step and bumps into bead $j$. Note that $i >_v j$, and $i, j$ must
be two adjacent letters in the total ordering $>_v$. Define $(v',T') = I(v,T)$ as follows. We
obtain $v'$ from $v$ by interchanging the adjacent beads $i$ and $j$, so that $\mathrm{sgn}(v') = -\mathrm{sgn}(v)$,
$\mathrm{wt}(v')x_j = \mathrm{wt}(v)x_i$, and $<_{v'}$ agrees with $<_v$ except that now $i <_{v'} j$.

We obtain $T'$ from $T$ by modifying the occurrences of $i$ and $j$ in $w(T)$ by a procedure
similar to the one used in §10.6. By the argument used to determine the fixed points of $I$,
we know that the occurrence of $i$ in $T$ that caused the bead collision is the rightmost entry
in some row of $T$, say the $k$th row; furthermore, for $1 \leq l < k$, row $l$ consists of $\lambda_l$ copies of
$v_{N-l}$. Now $i >_v v_{N-k}$ (or this entry of $T$ would not cause a collision), and so $j \geq_v v_{N-k}$.
This means that no entry in the first $k-1$ rows of $T$ equals $i$ or $j$, so these rows can be
ignored in the following discussion.

We now describe how to change $T$ into $T'$. Whenever $j$ occurs directly above $i$ in $T$
(call these occurrences matched pairs), interchange these two symbols. Some rows of $T$ will
contain unmatched $i$'s and $j$'s, in which $a \geq 0$ copies of $j$ are followed by $b \geq 0$ copies
of $i$. In particular, row $k$ will have $a \geq 0$ and $b > 0$, since the $i$ at the end of the row
cannot be matched with a $j$ above it. In row $k$, replace the unmatched symbols $j^a i^b$ by
$j^{a+1}i^{b-1}$. Then, in all rows containing unmatched $i$'s and $j$'s (including the new row $k$),
replace the unmatched symbols $j^a i^b$ by $i^b j^a$. The following assertions can now be checked:
$T'$ is a semistandard tableau relative to $<_{v'}$; $T'$ has one fewer $i$ and one more $j$ than $T$
does; $\mathrm{wt}(T')x_i = \mathrm{wt}(T)x_j$; $\mathrm{wt}(v',T') = \mathrm{wt}(v,T)$; $\mathrm{sgn}(v',T') = -\mathrm{sgn}(v,T)$; the last symbol in
row $k$ of $T'$ is an unmatched $j$; this unmatched $j$ will cause the first bead collision when $T'$
is used to move the beads on $v'$; and $I(v',T') = (v,T)$.


11.54. Example. Consider the object

    (v, T) = ( 2451763000... , T ),

where the first three rows of $T$ are 3333333, 6666666, and 77555. Processing the first two
rows of $T$, we move bead 3 right seven positions, then move bead 6 right seven positions with
no collisions. But in row 3, the rightmost symbol $i = 5$ causes a collision with bead $j = 1$.
There are no matched pairs of 5's and 1's in this tableau, so we first change the 555 in row 3
to 155, and then change this string to 551 to preserve semistandardness under the new
ordering. We have

    I(v, T) = ( 2415763000... , T' ),

where $T'$ is $T$ with row 3 changed to 77551. If we apply $I$ to this object, bead 1 bumps into
bead 5, and we find that $I(I(v,T)) = (v,T)$.

11.55. Example. Consider the object

    (v, T) = ( 2451763000... , T ),

where the second row of $T$ is 6667777. Now the first collision occurs when bead $i = 7$ bumps
into bead $j = 6$ because of the 7 at the end of the second row of $T$. The first two 6's in that
row are matched with 7's below, so the unmatched $i$'s and $j$'s in row 2 are 67777. We replace
this string first by 66777, and then by 77766. Interchanging the matched 6's and 7's leads to

    I(v, T) = ( 2451673000... , T' ).
11.13 Skew Schur Polynomials


In the remainder of this chapter, we will develop further combinatorial properties of skew
Schur polynomials. Recall the definition from 10.14: for every skew shape $\lambda/\mu$,

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\lambda/\mu)} x^{c(T)}. \]

In 10.35, we proved that skew Schur polynomials are symmetric. More precisely, we have
the expansion in the monomial basis:

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \sum_{\nu \in \mathrm{Par}_N} K_{\lambda/\mu,\nu}\, m_\nu(x_1,\ldots,x_N), \]

where $K_{\lambda/\mu,\nu}$ is the number of semistandard tableaux of shape $\lambda/\mu$ and content $\nu$. Our
current goal is to find combinatorial formulas for the expansion of skew Schur polynomials
relative to some other bases for $\Lambda_N^k$. We begin by proving an algebraic fact involving the Hall
scalar product.


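11.57. Theorem: Skew Schur Polynomials and the Hall Scalar Product. Suppose
$\lambda, \mu \in \mathrm{Par}$, $k = |\lambda| - |\mu|$, $N \geq |\lambda|$, and $f \in \Lambda_N^k$. Then $\langle s_{\lambda/\mu}, f \rangle = \langle s_\lambda, s_\mu f \rangle$.


Proof. We first prove the result for $f = h_\nu$, where $\nu \in \mathrm{Par}(k)$. On one hand, we have the
expansion

\[ s_{\lambda/\mu} = \sum_{\rho \in \mathrm{Par}(k)} K_{\lambda/\mu,\rho}\, m_\rho. \]

Taking the scalar product of both sides with $h_\nu$ gives $\langle s_{\lambda/\mu}, h_\nu \rangle = K_{\lambda/\mu,\nu}$ (see 10.132).
On the other hand, the Pieri rule shows that

\[ s_\mu h_\nu = \sum_\rho K_{\rho/\mu,\nu}\, s_\rho \]

(see 10.71). Taking the scalar product with $s_\lambda$ gives $\langle s_\lambda, s_\mu h_\nu \rangle = K_{\lambda/\mu,\nu}$. Thus the result
holds for every $f$ in the complete homogeneous basis.

The general case now follows by linearity: given any $f \in \Lambda_N^k$, write $f = \sum_\nu c_\nu h_\nu$ for
certain scalars $c_\nu \in K$. Then compute

\begin{align*}
\langle s_{\lambda/\mu}, f \rangle &= \Big\langle s_{\lambda/\mu}, \sum_\nu c_\nu h_\nu \Big\rangle = \sum_\nu c_\nu \langle s_{\lambda/\mu}, h_\nu \rangle \\
&= \sum_\nu c_\nu \langle s_\lambda, s_\mu h_\nu \rangle = \Big\langle s_\lambda, s_\mu \sum_\nu c_\nu h_\nu \Big\rangle = \langle s_\lambda, s_\mu f \rangle. \qquad \square
\end{align*}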


We can use 11.57 to expand skew Schur polynomials in terms of power-sum symmetric
polynomials.


11.58. Theorem: Power-Sum Expansion of Skew Schur Polynomials. Suppose $\mu \subseteq \lambda$
are partitions with $k = |\lambda| - |\mu|$. For all $N \geq |\lambda|$,

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \sum_{\nu \in \mathrm{Par}(k)} \frac{\chi^{\lambda/\mu}_\nu}{z_\nu}\, p_\nu(x_1,\ldots,x_N). \]

Proof. We imitate the proof of 11.53. Start with the expansion

\[ s_\mu p_\nu = \sum_\rho \chi^{\rho/\mu}_\nu\, s_\rho. \]

Now take the scalar product of both sides with $s_\lambda$, for a given partition $\lambda$:

\[ \langle s_\lambda, s_\mu p_\nu \rangle = \chi^{\lambda/\mu}_\nu. \]

We know the symmetric polynomial $s_{\lambda/\mu}$ has some expansion in the power-sum basis, say
$s_{\lambda/\mu} = \sum_\nu a_\nu p_\nu$ for some $a_\nu \in K$. To find a particular $a_\nu$, take the scalar product with
$p_\nu/z_\nu$ to get

\[ a_\nu = \langle s_{\lambda/\mu}, p_\nu/z_\nu \rangle = \langle s_\lambda, s_\mu p_\nu/z_\nu \rangle = \langle s_\lambda, s_\mu p_\nu \rangle / z_\nu = \chi^{\lambda/\mu}_\nu / z_\nu. \qquad \square \]

We also deduce the effect of the involution $\omega$ on skew Schur polynomials.


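11.59. Theorem: Action of $\omega$ on Skew Schur Polynomials. For all partitions $\mu \subseteq \lambda$
and all $N \geq |\lambda|$,

\[ \omega(s_{\lambda/\mu}(x_1,\ldots,x_N)) = s_{\lambda'/\mu'}(x_1,\ldots,x_N). \]

Proof. We already know that the involution $\omega$ is a ring homomorphism and isometry sending
every $s_\rho$ to $s_{\rho'}$. For each partition $\nu$ of size $|\lambda| - |\mu|$, we can therefore write:

\begin{align*}
\langle \omega(s_{\lambda/\mu}), s_\nu \rangle &= \langle \omega^2(s_{\lambda/\mu}), \omega(s_\nu) \rangle = \langle s_{\lambda/\mu}, s_{\nu'} \rangle = \langle s_\lambda, s_\mu s_{\nu'} \rangle \\
&= \langle \omega(s_\lambda), \omega(s_\mu s_{\nu'}) \rangle = \langle s_{\lambda'}, s_{\mu'} s_\nu \rangle = \langle s_{\lambda'/\mu'}, s_\nu \rangle.
\end{align*}

Thus $\omega(s_{\lambda/\mu})$ and $s_{\lambda'/\mu'}$ have the same expansion in the Schur basis and are therefore
equal. □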


11.14 Jacobi-Trudi Formulas 


Our next goal is to obtain formulas expressing skew Schur polynomials as determinants
involving the complete symmetric polynomials $h_k$ or the elementary symmetric polynomials
$e_k$. To derive these results, we need a new combinatorial construction relating tableaux to
collections of non-intersecting lattice paths.

We begin by interpreting $h_k(x_1,\ldots,x_N)$ in terms of lattice paths. Fix an integer $a$ and
consider the set $S$ of lattice paths from $(a,1)$ to $(a+k,N)$ that take unit steps up (u) and
east (e). We can encode a path $p$ in this set by listing the $y$-coordinates of the successive east
steps of $p$. For example, the path eeuueuee corresponds to the sequence $1,1,3,4,4$. This gives
a bijection from $S$ to the set of weakly increasing sequences $1 \leq i_1 \leq i_2 \leq \cdots \leq i_k \leq N$.
Let us weight the path corresponding to this sequence by $x_{i_1}x_{i_2}\cdots x_{i_k}$. Comparing to the
definition of $h_k$, we see that

\[ h_k(x_1,\ldots,x_N) = \sum_{p \in S} \mathrm{wt}(p). \]

This formula holds for all integers $k$ (possibly negative), provided we take $h_0 = 1$ and $h_k = 0$
for negative $k$.
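The encoding of lattice paths by the heights of their east steps can be illustrated with a few lines of Python. This is a sketch with invented helper names; a path is represented as a string over the letters `e` and `u`, starting at height $y = 1$.

```python
from itertools import combinations, combinations_with_replacement


def east_step_heights(path):
    """y-coordinates of the east steps of a path written with 'e' (east) and 'u' (up)."""
    heights, y = [], 1
    for step in path:
        if step == 'u':
            y += 1
        else:
            heights.append(y)
    return heights


def paths(k, n):
    """All paths with k east steps and n - 1 up steps, i.e. from (a, 1) to (a + k, n)."""
    total = k + n - 1
    result = []
    for east_positions in combinations(range(total), k):
        chosen = set(east_positions)
        result.append(''.join('e' if i in chosen else 'u' for i in range(total)))
    return result


print(east_step_heights('eeuueuee'))   # [1, 1, 3, 4, 4]

# The encodings of all paths are exactly the weakly increasing sequences,
# which index the monomials of h_k (here k = 3 and N = 4).
encoded = sorted(tuple(east_step_heights(p)) for p in paths(3, 4))
print(encoded == sorted(combinations_with_replacement(range(1, 5), 3)))   # True
```

Each weakly increasing sequence $i_1 \leq \cdots \leq i_k$ corresponds to the monomial $x_{i_1}\cdots x_{i_k}$, so the check above confirms the bijection behind the formula for $h_k$ in one small case.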

Now let $\lambda$ be a partition with $n \leq N$ parts, and let $\mu \subseteq \lambda$. Let $X$ be the set of fillings of
the skew shape $\lambda/\mu$ using letters in $\{1,2,\ldots,N\}$ such that each row weakly increases. Let
$Y$ be the set of sequences $P = (p_1,\ldots,p_n)$ where $p_i$ is a lattice path from $(n-i+\mu_i, 1)$
to $(n-i+\lambda_i, N)$. Let $\mathrm{wt}(P) = \mathrm{wt}(p_1)\cdots\mathrm{wt}(p_n)$ record the $y$-coordinates of all the east
steps of the paths in $P$. As explained above, we can encode each row $i$ of a filling $U \in X$
as a lattice path $p_i$ from $(a,1)$ to $(a+\lambda_i-\mu_i, N)$, where $a = n-i+\mu_i$. The association
$U \mapsto (p_1,\ldots,p_n)$ defines a weight-preserving bijection $f : X \to Y$. Some examples are
shown in Figure 11.4.

We say that two lattice paths intersect iff they share a common edge or vertex. Let $Y'$
be the set of $P \in Y$ such that no two paths in $P$ intersect. Inspection of Figure 11.4 suggests
that $f$ restricts to a weight-preserving bijection from $\mathrm{SSYT}_N(\lambda/\mu)$ to $Y'$. To see why this
holds, consider consecutive entries $U(i,j) = a$ and $U(i+1,j) = b$ in column $j$ of a filling
$U \in X$. In $f(U)$, path $p_i$ has an east step from $(n-i+\mu_i+(j-\mu_i)-1, a) = (n+j-i-1, a)$
to $(n+j-i, a)$, whereas $p_{i+1}$ has an east step from $(n+j-i-2, b)$ to $(n+j-i-1, b)$.
Suppose $a \geq b$. Since the beginning of $p_i$ goes from $(n-i+\mu_i, 1)$ to $(n+j-i-1, a)$, there
is no way for $p_{i+1}$ (which starts to the left of $p_i$) to reach the point $(n+j-i-1, b)$ without
intersecting $p_i$. Conversely, suppose two paths intersect. Then there must exist $i$ such that
$p_i$ and $p_{i+1}$ intersect. The earliest intersection of these paths must occur when $p_{i+1}$ "bumps
into" $p_i$ by taking an east step ending at some point $(n+j-i-1, b)$. One may now check
that there must exist an east step in $p_i$ starting at $(n+j-i-1, a)$ for some $a \geq b$, which
shows that $U(i,j) \geq U(i+1,j)$ in the filling $U$.

Now we are ready to prove the Jacobi-Trudi formulas. The idea is to introduce a large
collection of signed, weighted sequences of paths that model the terms of a determinant.
Cancellations will remove all sequences of intersecting paths, leaving only the objects in $Y'$,
which correspond to semistandard skew tableaux.


11.60. Theorem: Jacobi-Trudi Formula. Suppose $\lambda$ is a partition with $n \leq N$ parts,
and $\mu \subseteq \lambda$. Then

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \det\|h_{\lambda_i-\mu_j+j-i}(x_1,\ldots,x_N)\|_{1\leq i,j\leq n}. \]


Proof. By the definition of a determinant (see 9.37), the right side of the desired formula
can be written

\[ \sum_{w \in S_n} \mathrm{sgn}(w) \prod_{i=1}^{n} h_{\lambda_i-\mu_{w(i)}+w(i)-i}(x_1,\ldots,x_N). \]

This is the generating function for the following signed, weighted set. Let $Z$ be the set of
sequences $(w,p_1,\ldots,p_n)$ such that $w \in S_n$ and $p_i$ is a path from $(n-w(i)+\mu_{w(i)}, 1)$ to
$(\lambda_i+n-i, N)$. The weight of such a sequence is $\prod_{i=1}^{n}\mathrm{wt}(p_i)$, and the sign is $\mathrm{sgn}(w)$.

The following involution will cancel all objects $(w,p_1,\ldots,p_n)$ in which two or more paths
intersect. Among all lattice points $(u,v)$ where two paths intersect, choose the one for which
$u$ is minimized; if there are ties, choose the point that minimizes $v$. Let $i < j$ be the two
least indices such that $p_i$ and $p_j$ pass through $(u,v)$. Write $p_i = qr$ where $q$ (resp. $r$) is the
part of $p_i$ before (resp. after) the point $(u,v)$. Similarly write $p_j = st$. Now, pair the given
object with the object $(w',p'_1,\ldots,p'_n)$ where $w' = w \circ (i,j)$, $p'_i = sr$, $p'_j = qt$, and $p'_k = p_k$
for all $k \neq i,j$. (Thus we have switched the initial segments of the two intersecting paths.)
One may check that the new object lies in $Z$ and has the same weight and opposite sign as
the original object. One should also check that applying the map a second time will restore
the original object, so we have an involution. Some examples are shown in Figure 11.5.
(Note that path $p_i$ goes from the $w(i)$th point from the right on the line $y = 1$ to the $i$th
point from the right on the line $y = N$.) Let us consider an object $(w,p_1,\ldots,p_n)$ in $Z$ that
is not canceled by the involution. No two paths in this object can intersect. We claim that
this forces $w = \mathrm{id}$. For otherwise, there would exist $i < j$ with $w(i) > w(j)$. But then $p_i$
would start to the left of $p_j$ on the line $y = 1$ and end to the right of $p_j$ on the line $y = N$,
which would force $p_i$ and $p_j$ to intersect. So $w = \mathrm{id}$. Erasing $w$ maps the fixed points in $Z$
bijectively to the set $Y'$, which in turn maps bijectively to $\mathrm{SSYT}_N(\lambda/\mu)$, as shown in the
discussion preceding the theorem. □

FIGURE 11.5
Cancellation mechanism for intersecting paths.


11.61. Theorem: Second Jacobi-Trudi Formula. Suppose $\lambda$ is a partition with
$\lambda_1 = n \leq N$, and $\mu \subseteq \lambda$. Then

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \det\|e_{\lambda'_i-\mu'_j+j-i}(x_1,\ldots,x_N)\|_{1\leq i,j\leq n}. \]

Proof. For all $f_{ij} \in \Lambda_N$, we have $\omega(\det\|f_{ij}\|) = \det\|\omega(f_{ij})\|$. This follows from the defining
formula for determinants and the fact that $\omega$ is a ring homomorphism. Now 11.61 follows
by applying $\omega$ to both sides of the first Jacobi-Trudi formula

\[ s_{\lambda'/\mu'} = \det\|h_{\lambda'_i-\mu'_j+j-i}\|. \qquad \square \]


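11.62. Example. According to the first Jacobi-Trudi formula,

\[ s_{(3,3,1)} = \det\begin{pmatrix} h_3 & h_4 & h_5 \\ h_2 & h_3 & h_4 \\ 0 & 1 & h_1 \end{pmatrix} = h_{(3,3,1)} + h_{(5,2)} - h_{(4,3)} - h_{(4,2,1)}. \]

Note that the main diagonal entries in the formula for $s_\lambda$ are $h_{\lambda_1}, h_{\lambda_2}, \ldots, h_{\lambda_n}$, and the
subscripts increase by 1 (resp. decrease by 1) as we read to the right (resp. left) along each
row. Similarly,

\[ s_{(3,3,1)} = \det\begin{pmatrix} e_3 & e_4 & e_5 \\ e_1 & e_2 & e_3 \\ 1 & e_1 & e_2 \end{pmatrix} = e_{(3,2,2)} + e_{(4,3)} + e_{(5,1,1)} - e_{(5,2)} - e_{(3,3,1)} - e_{(4,2,1)}. \]

Here is a typical expansion of a skew Schur polynomial:

\[ s_{(5,5,3)/(3,2,0)} = \det\begin{pmatrix} h_2 & h_4 & h_7 \\ h_1 & h_3 & h_6 \\ 0 & 1 & h_3 \end{pmatrix} = h_{(3,3,2)} + h_{(7,1)} - h_{(4,3,1)} - h_{(6,2)}. \]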
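The determinants in this example can be checked numerically. The sketch below evaluates everything at the arbitrarily chosen point $(1,2,3)$; the function names are hypothetical, $h_k$ is computed directly from its definition as a sum over multisets, and the straight-shape case is compared with the quotient of determinants from 11.45.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod


def det(matrix):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum((-1) ** j * entry * det([row[:j] + row[j + 1:] for row in matrix[1:]])
               for j, entry in enumerate(matrix[0]))


def h(k, xs):
    """h_k evaluated at xs, with h_0 = 1 and h_k = 0 for negative k."""
    if k < 0:
        return 0
    return sum(prod(xs[j] for j in multiset)
               for multiset in combinations_with_replacement(range(len(xs)), k))


def jacobi_trudi(lam, mu, xs):
    """det || h_{lam_i - mu_j + j - i} || evaluated at xs (n = number of parts of lam)."""
    n = len(lam)
    mu = list(mu) + [0] * (n - len(mu))
    return det([[h(lam[i] - mu[j] + j - i, xs) for j in range(n)] for i in range(n)])


def bialternant(lam, xs):
    """a_{lam+delta(N)} / a_{delta(N)} evaluated at xs, as in 11.45."""
    n = len(xs)
    lam = list(lam) + [0] * (n - len(lam))
    top = [[x ** (lam[i] + n - 1 - i) for x in xs] for i in range(n)]
    bottom = [[x ** (n - 1 - i) for x in xs] for i in range(n)]
    return Fraction(det(top), det(bottom))


xs = (1, 2, 3)
H = lambda *parts: prod(h(p, xs) for p in parts)    # h_mu = product of the h_{mu_i}

print(jacobi_trudi((3, 3, 1), (), xs) == bialternant((3, 3, 1), xs))                        # True
print(jacobi_trudi((3, 3, 1), (), xs) == H(3, 3, 1) + H(5, 2) - H(4, 3) - H(4, 2, 1))       # True
print(jacobi_trudi((5, 5, 3), (3, 2), xs) == H(3, 3, 2) + H(7, 1) - H(4, 3, 1) - H(6, 2))   # True
```

A numerical check at one point does not prove the identities, but it does catch transcription errors in the matrices and in the expanded sums.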
0 1 hg 


11.15 Inverse Kostka Matrix 


In Chapter 10, the Kostka matrix played a prominent role in relating the Schur basis of $\Lambda_N^k$
to several other bases. More specifically, we proved the formulas

\[ s_\lambda = \sum_\mu K_{\lambda,\mu}\, m_\mu, \qquad h_\mu = \sum_\lambda K_{\lambda,\mu}\, s_\lambda, \qquad e_\mu = \sum_\lambda K_{\lambda,\mu}\, s_{\lambda'}, \]

where all symmetric polynomials have $N$ variables and all summations extend over $\mathrm{Par}_N$.
Letting $K = K_N$ be the matrix of Kostka numbers with rows and columns indexed by
elements of $\mathrm{Par}_N$, these relations can also be written

\[ s = Km, \qquad h = K^t s, \qquad e = K^t \omega(s). \]

We know that the Kostka matrix is invertible (being unitriangular). Let $K'_{\lambda,\mu}$ be the
entry in row $\lambda$ and column $\mu$ of the inverse of the Kostka matrix. Inverting the relations
above, we see that

\[ m_\lambda = \sum_\mu K'_{\lambda,\mu}\, s_\mu, \qquad s_\mu = \sum_\lambda K'_{\lambda,\mu}\, h_\lambda, \qquad s_{\mu'} = \sum_\lambda K'_{\lambda,\mu}\, e_\lambda. \]

Observe that the determinant formulas in the previous section, which express Schur
polynomials in terms of complete homogeneous symmetric polynomials, give algebraic
interpretations for the coefficients $K'_{\lambda,\mu}$. Here we wish to derive combinatorial interpretations for
these coefficients. To do this, we need the concept of a special rim-hook tableau.


11.63. Definition: Special Rim-hook Tableaux. For $\lambda, \mu \in \mathrm{Par}_N$, a special rim-hook
tableau of shape $\mu$ and type $\lambda$ is a rim-hook tableau $S$ of shape $\mu$ and content $\alpha$ such that
$\mathrm{sort}(\alpha) = \lambda$ and every nonzero rim-hook in $S$ contains a cell in the leftmost column of the
diagram of $\mu$. The sign of such a tableau is defined as in 11.48. Let $\mathrm{SRHT}(\mu, \lambda)$ be the set
of special rim-hook tableaux of shape $\mu$ and type $\lambda$.


11.64. Theorem: Combinatorial Interpretation of Inverse Kostka Matrix. For all
$\lambda, \mu \in \mathrm{Par}_N$,

$$K'_{\lambda,\mu} = \sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \operatorname{sgn}(S).$$


Proof. We intend to give a combinatorial proof of the identity

$$a_{\delta(N)}(x_1,\ldots,x_N)\, m_\lambda(x_1,\ldots,x_N) = \sum_{\mu\in\mathrm{Par}_N}\ \sum_{S\in\mathrm{SRHT}(\mu,\lambda)} \operatorname{sgn}(S)\, a_{\mu+\delta(N)}(x_1,\ldots,x_N).$$

Once this is done, the theorem will follow by dividing both sides by $a_{\delta(N)}$ and comparing
the resulting identity to the known expansion $m_\lambda = \sum_{\mu} K'_{\lambda,\mu} s_\mu$.

To prove the identity, we study a combinatorial interpretation of the product $a_{\delta(N)} m_\lambda$
involving abaci. The polynomial $a_{\delta(N)}$ represents a justified abacus containing $N$ beads
labeled $w(N),\ldots,w(1)$ in positions $0,\ldots,N-1$ (respectively). Given such an abacus, we
can view $m_\lambda(x_1,\ldots,x_N)$ as the sum of all distinct monomials $x_{w(1)}^{e(1)} \cdots x_{w(N)}^{e(N)}$ such that the
exponent sequence $(e(1),\ldots,e(N))$ is a rearrangement of $(\lambda_1,\ldots,\lambda_N)$. (Here and below,
we view elements of $\mathrm{Par}_N$ as partitions with exactly $N$ parts, some of which may be zero.)
The multiplication of $a_{\delta(N)}$ by one of these monomials can be implemented on the abacus as
follows. Imagine moving the $N$ justified beads from their current runner to a new, initially
empty runner, by moving each bead $w(i)$ from position $N-i$ on the old runner to position
$N-i+e(i)$ on the new runner. Call such a transformation of the justified abacus a $\lambda$-move.
A given $\lambda$-move either causes a bead collision on the new runner, or else produces a new
abacus, which is enumerated by a monomial in $a_{\mu+\delta(N)}(x_1,\ldots,x_N)$ for some $\mu \in \mathrm{Par}_N$.
Consider the situation where a bead collision occurs. Choose $i$ minimal such that bead
$w(i)$ collides with some other bead on the new runner, and then choose $j$ minimal such that
bead $w(i)$ collides with $w(j)$. Create a new object counted by $a_{\delta(N)} m_\lambda$ by switching beads
$w(i)$ and $w(j)$ on the old abacus, and switching $e(i)$ and $e(j)$ in the exponent vector. This
defines a sign-reversing, weight-preserving involution that cancels all objects in which bead
collisions occur.

To complete the proof, we must find a sign-preserving, weight-preserving bijection $\phi$
from the set $X$ of uncanceled objects counted by $a_{\delta(N)} m_\lambda$ to the signed weighted set

$$\bigcup_{\mu\in\mathrm{Par}_N} \mathrm{SRHT}(\mu,\lambda) \times \mathrm{LAbc}(\mu+\delta(N)).$$

FIGURE 11.6
A special rim-hook tableau.


For this purpose, let us fix $\mu \in \mathrm{Par}_N$ and consider the ways in which a justified abacus with
$N$ beads can be transformed into an abacus in $\mathrm{LAbc}(\mu+\delta(N))$ by means of a $\lambda$-move. Let
us temporarily ignore bead labels and signs, concentrating at first only on the positions of
the $N$ beads. The positions of the $N$ beads on the old runner are the entries in the sequence
$\delta(N) = (N-1, N-2, \ldots, 2, 1, 0)$. A $\lambda$-move adds some rearrangement of the sequence
$\lambda = (\lambda_1,\ldots,\lambda_N)$ to the sequence $\delta(N)$. We will obtain an abacus in $\mathrm{LAbc}(\mu+\delta(N))$ iff
the sum of these sequences is some rearrangement of the sequence
$\mu+\delta(N) = (\mu_1+N-1, \ldots, \mu_N+N-N)$.

We now show that the rearrangements of $\lambda$ that produce abaci in $\mathrm{LAbc}(\mu+\delta(N))$ can be
encoded by special rim-hook tableaux of shape $\mu$ and type $\lambda$. The proof will use induction on
$N$. Let us first illustrate the idea of the proof by considering an example. Take $N = 5$,
$\mu = (7,5,4,4,2)$, and $\lambda = (8,7,6,1,0)$. We seek rearrangements of the vector $(8,7,6,1,0)$ which,
when added to the vector $(4,3,2,1,0)$, produce a rearrangement of $\mu+\delta(N) = (11,8,6,5,2)$.
In this example, the only solution turns out to be $(1,8,0,7,6)+(4,3,2,1,0) = (5,11,2,8,6)$.
We can visualize this solution using the special rim-hook tableau in Figure 11.6, in which
the rim-hooks (from top to bottom) have lengths $(1,8,0,7,6)$. If we start with a labeled
justified abacus 54321000··· and perform a $\lambda$-move using the rearrangement $(1,8,0,7,6)$,
we obtain the abacus 003001504002000··· $\in \mathrm{LAbc}(11,8,6,5,2)$. The sign of this abacus,
namely $\operatorname{sgn}(24513) = -1$, differs from the sign of the original abacus, namely $\operatorname{sgn}(12345) = +1$,
by a factor of $(-1)^5 = \operatorname{sgn}(S)$. A similar remark holds if the original abacus had involved
some other permutation of the five labels.
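
The search for rearrangements in this example is small enough to brute-force. The sketch below is an illustration only (the function name `lambda_moves` is a hypothetical helper, not notation from the text): it tries every distinct rearrangement of $\lambda$, keeps those whose sum with $\delta(N)$ is a rearrangement of $\mu+\delta(N)$, and records the sign change as $(-1)$ raised to the number of pairs of beads whose relative order flips.

```python
from itertools import permutations

def lambda_moves(lam, mu):
    """Return all (rearrangement, sign) pairs for lambda-moves landing in LAbc(mu + delta(N)).

    lam and mu are partitions padded with zeros to the same length N.
    """
    n = len(lam)
    delta = list(range(n - 1, -1, -1))                 # old positions (N-1, ..., 1, 0)
    target = sorted(m + d for m, d in zip(mu, delta))  # positions required by mu + delta(N)
    found = []
    for e in sorted(set(permutations(lam))):           # distinct rearrangements only
        new_pos = [d + x for d, x in zip(delta, e)]
        if sorted(new_pos) != target:
            continue                                   # bead collision, or a different shape
        # Old positions strictly decrease; every pair whose order flips is one bead passing another.
        passes = sum(1 for i in range(n) for j in range(i + 1, n) if new_pos[i] < new_pos[j])
        found.append((e, (-1) ** passes))
    return found

print(lambda_moves((8, 7, 6, 1, 0), (7, 5, 4, 4, 2)))
```

For the data of the example this search returns the single rearrangement $(1,8,0,7,6)$ with sign $-1$, in agreement with the discussion above.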

With this example in mind, we return to the general proof. We are seeking permutations
$j_1 \cdots j_N$ and $k_1 \cdots k_N$ satisfying the system of equations

$$\begin{aligned} 0 + \lambda_{j_N} &= \mu_{k_N} + N - k_N \\ 1 + \lambda_{j_{N-1}} &= \mu_{k_{N-1}} + N - k_{N-1} \\ &\ \ \vdots \\ N-1 + \lambda_{j_1} &= \mu_{k_1} + N - k_1 \end{aligned} \tag{11.5}$$

In particular, to satisfy the first equation, we need an index $j = j_N$ and an index $k = k_N$
such that $\lambda_j = \mu_k + N - k$. If such indices exist, we encode them by drawing the unique
border ribbon of length $\lambda_j$ starting in the leftmost cell of row $N$ of $\mu$. By choice of $j$ and $k$,
this border ribbon must end in the rightmost cell of row $k$ of $\mu$. In terms of the abaci, the
$\lambda$-move encoded by $j_1 \cdots j_N$ moves the bead in position 0 on the old runner (the $N$th bead
from the right) to position $\mu_k + N - k$ on the new runner (which will become the $k$th bead
from the right). Thus this bead "moves past" $N-k$ other beads during the $\lambda$-move, which
causes a sign change of $(-1)^{N-k}$ for any choice of labels. But $N-k$ is precisely the spin of
the border ribbon we just drew.

To finish solving system (11.5), let $\lambda^*$ be the partition obtained by dropping one part $\lambda_j$
from $\lambda$, and let $\mu^*$ be the partition in $\mathrm{Par}_{N-1}$ obtained by erasing the cells of $\mu$ occupied by
the ribbon that starts in row $N$. Suppose we ignore the first equation in the system (11.5)
and subtract 1 from both sides of the remaining $N-1$ equations. One may check that
the resulting system of $N-1$ equations is precisely the system we must solve to change a
justified abacus to an abacus in $\mathrm{LAbc}(\mu^*+\delta(N-1))$ by means of a $\lambda^*$-move. (For instance,
in the example considered earlier, after we move a bead from position 0 to position 6
[accounting for the lowest rim-hook in the displayed tableau], we have $\lambda^* = (8,7,1,0)$ and
$\mu^* = (7,5,3,1)$. Having moved one bead, we are left with the task of moving beads from
positions $(4,3,2,1) = (1,1,1,1)+\delta(4)$ to positions $(11,8,5,2) = (1,1,1,1)+\mu^*+\delta(4)$ using
the moves in $\lambda^* = (8,7,1,0)$.) By induction on $N$, the solutions of the reduced system are
encoded by special rim-hook tableaux $S^*$ of shape $\mu^*$ and type $\lambda^*$; and furthermore, the net
sign change going from the old abacus to the new abacus (disregarding the bead originally
in position 0) is $\operatorname{sgn}(S^*)$. It follows that all solutions of the original system are encoded by
special rim-hook tableaux $S$ of shape $\mu$ and type $\lambda$; and furthermore, the net sign change
going from the old abacus to the new abacus (taking all beads into account) is $\operatorname{sgn}(S)$.

The preceding discussion contains an implicit recursive definition of the desired bijection
$\phi$. More explicitly, suppose $z = (w(N)\cdots w(1)000\cdots,\ e(N)\cdots e(1)) \in X$ is an uncanceled
object counted by $a_{\delta(N)} m_\lambda$. Then $\phi(z) = (S, v)$ where $v \in \mathrm{LAbc}(\mu+\delta(N))$ is obtained from
the first component of $z$ by moving bead $w(i)$ right $e(i)$ positions for all $i$, and $S$ is the
unique special rim-hook tableau (of shape $\mu$ determined by $v$) that has a rim-hook of length
$e(i)$ starting in the leftmost cell of row $i$ of the diagram. The preceding arguments show that
$\phi$ preserves signs and weights. To compute $\phi^{-1}(S, v)$, it suffices to note that the sequence
$(e(1),\ldots,e(N))$ is the content of the rim-hook tableau $S$. Knowledge of this sequence allows
us to reverse the $\lambda$-move and recover $w(N)\cdots w(1)$. Thus, $\phi$ is a bijection. $\square$


11.65. Remark. An alternate approach to the theorem is to define

$$K'_{\lambda,\mu} = \sum_{S\in\mathrm{SRHT}(\mu,\lambda)} \operatorname{sgn}(S)$$


and then give a combinatorial proof of the matrix identity KK’ = I (see 11.127). Since K 
is known to be invertible, it follows that K’ must be the (two-sided) matrix inverse of K. 
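
The identity $KK' = I$ can be tested by machine for small degrees. The sketch below is our own illustration under stated assumptions: Kostka numbers are counted by peeling horizontal strips off the shape, and $K'_{\lambda,\mu}$ is computed as the signed number of collision-free $\lambda$-moves into $\mathrm{LAbc}(\mu+\delta(N))$ with $N = |\lambda|$, which by the proof of 11.64 equals the signed count of special rim-hook tableaux. The function names are hypothetical.

```python
from itertools import permutations

def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def strip_removals(shape, k):
    """Yield the partitions nu such that shape/nu is a horizontal strip with k cells."""
    def rec(i, remaining, acc):
        if i == len(shape):
            if remaining == 0:
                yield tuple(acc)
            return
        below = shape[i + 1] if i + 1 < len(shape) else 0
        for removed in range(min(remaining, shape[i] - below) + 1):
            yield from rec(i + 1, remaining - removed, acc + [shape[i] - removed])
    yield from rec(0, k, [])

def kostka(shape, content):
    """Number of semistandard tableaux of the given shape and content."""
    shape = tuple(p for p in shape if p)
    content = tuple(c for c in content if c)
    if not content:
        return 1 if not shape else 0
    # The cells holding the largest value form a horizontal strip; remove it and recurse.
    return sum(kostka(inner, content[:-1]) for inner in strip_removals(shape, content[-1]))

def inverse_kostka(lam, mu):
    """Signed count of lambda-moves from the justified abacus into LAbc(mu + delta(N))."""
    n = sum(lam)                                    # N = |lam| beads always suffice
    lam = tuple(lam) + (0,) * (n - len(lam))
    mu = tuple(mu) + (0,) * (n - len(mu))
    delta = range(n - 1, -1, -1)
    target = sorted(m + d for m, d in zip(mu, delta))
    total = 0
    for e in set(permutations(lam)):                # distinct rearrangements of lam
        pos = [d + x for d, x in zip(delta, e)]
        if sorted(pos) == target:
            passes = sum(pos[i] < pos[j] for i in range(n) for j in range(i + 1, n))
            total += (-1) ** passes
    return total

def kostka_times_inverse_is_identity(n):
    """Check K K' = I on the partitions of n."""
    pars = list(partitions(n))
    size = len(pars)
    K = [[kostka(l, m) for m in pars] for l in pars]
    Kp = [[inverse_kostka(l, m) for m in pars] for l in pars]
    prod = [[sum(K[i][t] * Kp[t][j] for t in range(size)) for j in range(size)]
            for i in range(size)]
    return prod == [[int(i == j) for j in range(size)] for i in range(size)]

print([kostka_times_inverse_is_identity(n) for n in range(1, 6)])
```

Enumerating rearrangements is exponential in $N$, so this is only practical for small partitions; it is meant as a check on the combinatorial interpretation, not as an efficient algorithm.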


11.16 Schur Expansion of Skew Schur Polynomials 


We now consider the expansion of skew Schur polynomials as linear combinations of ordinary
Schur polynomials. Since the ordinary Schur polynomials are a basis of $\Lambda_N$ and the skew
Schur polynomials are in this vector space, we know there exist unique scalars $c^{\lambda}_{\nu,\mu} \in \mathbb{Q}$
such that

$$s_{\lambda/\nu}(x_1,\ldots,x_N) = \sum_{\mu} c^{\lambda}_{\nu,\mu}\, s_\mu(x_1,\ldots,x_N), \tag{11.6}$$

where it suffices to sum over partitions $\mu$ of size $|\lambda/\nu|$. The scalars $c^{\lambda}_{\nu,\mu}$ are called
Littlewood-Richardson coefficients. The following result shows that these coefficients are all nonnegative
integers. Recall that, for a semistandard tableau $T$ of any shape, the word of $T$ is obtained
by concatenating the rows of $T$ from bottom to top.


11.66. Theorem: Littlewood-Richardson Rule for Skew Schur Polynomials. For
all partitions $\lambda, \mu, \nu$, $c^{\lambda}_{\nu,\mu}$ is the number of semistandard tableaux $T$ of shape $\lambda/\nu$ and
content $\mu$ such that every suffix of the word of $T$ has partition content. In other words,
writing $w(T) = w_1 w_2 \cdots w_n$, we require that for all $k \le n$ and all $i \ge 1$, the number of $i$'s
in the suffix $w_k w_{k+1} \cdots w_n$ equals or exceeds the number of $(i+1)$'s in this suffix.


Proof. Multiplying both sides of (11.6) by $a_{\delta(N)}$, it suffices to prove the identity

$$a_{\delta(N)}(x_1,\ldots,x_N)\, s_{\lambda/\nu}(x_1,\ldots,x_N) = \sum_{\mu} c^{\lambda}_{\nu,\mu}\, a_{\mu+\delta(N)}(x_1,\ldots,x_N).$$

The idea is to generalize the proof of the special case $\nu = (0)$ which we gave in §11.12.
Model the left side of the desired identity by the set $X$ of pairs $(v, T)$, where $v$ is a justified
$N$-bead labeled abacus and $T$ is a semistandard tableau of shape $\lambda/\nu$ over the alphabet
$\{1, 2, \ldots, N\}$ ordered by $<_v$. Since skew Schur polynomials are symmetric, the generating
function for the signed, weighted set $X$ is $a_{\delta(N)} s_{\lambda/\nu}$.

We now define an involution $I : X \to X$. Given $(v, T) \in X$, $T$ determines a sequence
of bead motions on $v$ by reading $w(T)$ from right to left and moving bead $k$ one step to
the right each time the symbol $k$ is seen. If these bead motions cause a collision, define
$I(v, T) = (v', T')$ by the following steps. Suppose the first collision occurs when bead $i$
bumps into bead $j$ (where $i >_v j$ are adjacent beads in $v$). Let $v'$ be $v$ with beads $i$ and $j$
switched, so $\operatorname{sgn}(v') = -\operatorname{sgn}(v)$ and $\mathrm{wt}(v')x_j = \mathrm{wt}(v)x_i$.

Next, we calculate $T'$ from $T$ as follows. Starting with the word of $T$, replace each $i$ by
a left parenthesis, each $j$ by a right parenthesis, and ignore all other symbols. Match left
and right parentheses in the resulting string of parentheses, and ignore these matched pairs
of parentheses hereafter. The remaining unmatched parentheses must consist of a string
of $a \ge 0$ right parentheses followed by a string of $b \ge 0$ left parentheses, since if a left
parenthesis appeared somewhere to the left of a right parenthesis we could find another
matched pair of parentheses.

Note that $b > 0$, since otherwise bead $i$ would never bump into bead $j$. Indeed, the first
bead collision occurs when we reach the rightmost unmatched left parenthesis (occurrence
of $i$) in the word of $T$. Now, change the subword of unmatched parentheses from "$)^a(^b$" to
"$)^{b-1}(^{a+1}$", and then convert all left parentheses to $j$'s and all right parentheses to $i$'s. One
may verify that the new word is the word of a tableau $T' \in \mathrm{SSYT}_N(\lambda/\nu)$, relative to the
ordering $<_{v'}$, using the facts that $i$ and $j$ are adjacent relative to the orderings $<_v$ and $<_{v'}$,
and the status of a given parenthesis symbol in $T'$ (matched or unmatched) is the same as
its status in $T$. See the example following the proof for more discussion of this point.

Because $T'$ has one less $i$ than $T$ and one more $j$ than $T$, we have $\mathrm{wt}(T')x_i = \mathrm{wt}(T)x_j$.
Since we also had $\mathrm{wt}(v')x_j = \mathrm{wt}(v)x_i$, we see that $\mathrm{wt}(v', T') = \mathrm{wt}(v, T)$. Thus $I$
is sign-reversing and weight-preserving. Finally, to check that $I$ is an involution, consider
what happens when we use $T'$ to move the beads on $v'$. Bead $j$ on $v'$ moves the same way
as bead $i$ did on $v$ (and vice versa) until we reach the rightmost unmatched parenthesis
(relative to $i$ and $j$) in $w(T')$. When this symbol is reached, bead $j$ bumps into bead $i$ on $v'$,
just as bead $i$ bumped into bead $j$ on $v$. To compute $I(v', T')$, we will therefore apply the
parenthesis modification rule to the $i$'s and $j$'s appearing in $w(T')$. This rule will change the
unmatched parentheses from "$)^{b-1}(^{a+1}$" back to "$)^a(^b$", which shows that $I(v', T') = (v, T)$.
So $I$ is an involution.

All that remains is to analyze the fixed points of $I$, which are (by definition) the pairs
$(v, T)$ for which no bead collision occurs. Recall that we are starting with a justified abacus $v$,
scanning the symbols in $w(T) = w_1 \cdots w_n$ from right to left, and moving the corresponding
beads on $v$. Suppose all suffixes of $w(T)$ have partition content relative to the ordering $<_v$
(which means the rightmost bead label occurs at least as often in each suffix as the next
bead label, and so on). We see from the description of the bead motion that no collision
will occur. Conversely, if the condition is first violated by some suffix $w_k w_{k+1} \cdots w_n$, then a
collision will occur at this point in the scan. Thus the fixed points of $I$ are the pairs $(v, T)$
such that each suffix of $w(T)$ has partition content relative to $<_v$. We map each such fixed
point to the abacus $v^*$ obtained from $v$ by performing the bead motions specified by $T$. The
abacus $v^*$ lies in the set $\mathrm{LAbc}(\mu+\delta(N))$, where $\mu$ is the content of $T$ (calculating content
relative to $<_v$, so $\mu_1$ is the number of times the rightmost bead moves, etc.).

We can obtain all the fixed points of $I$ from fixed points of the form $(v^0, T)$, where
$v^0 = (N, N-1, \ldots, 1, 0, 0, \ldots)$, $<_{v^0}$ is the usual ordering on integers, and $T$ is a semistandard
tableau satisfying the conditions in the theorem statement. We need only permute the bead
labels in $v^0$ by any $w \in S_N$, and permute the entries of $T$ in the same way. The object $(v^0, T)$
thereby generates $N!$ fixed points which together contribute one copy of $a_{\mu+\delta(N)}(x_1,\ldots,x_N)$
to the generating function for the fixed points of $I$. The total number of times this term
appears in the generating function is the total number of semistandard tableaux $T$ of content
$\mu$ satisfying the conditions in the theorem. Since the generating function for $X$ must equal
the generating function for the fixed point set of $I$, the proof is complete. $\square$


11.67. Example. To illustrate the parenthesis construction, we compute $I(5432100\cdots, T)$,
where


The word of $T$ is 12211122221111212222111. The suffix 2222111 of $w(T)$ does not have
partition content, so this object will cancel with some object $(5431200\cdots, T')$. To find $T'$,
first convert 1's to right parentheses and 2's to left parentheses in $w(T)$:

    12211122221111212222111
    )(()))(((())))()(((()))

Now we balance parentheses and mark the remaining unmatched symbols:

    )(()))(((())))()(((()))
    *    *          *

The substring of unmatched parentheses is "))(". Observe that the rightmost symbol in this
substring is a left parenthesis corresponding to the first 2 in the offending suffix 2222111,
and this 2 is the symbol in $w(T)$ causing the first bead collision. As directed by the proof,
we convert the unmatched parenthesis string to "(((" and then replace left parentheses by
1's and right parentheses by 2's:

    *    *          *
    ((())((((())))()(((()))
    11122111112222121111222

This new word is $w(T')$, so finally


Observe that $T'$ is a semistandard tableau relative to the ordering $5 > 4 > 3 > 1 > 2$.
In particular, columns of $T'$ strictly increase because whenever 1 appears above 2 in $T$,
these occurrences of 1 and 2 become matched parentheses. Rearranging the unmatched
parentheses does not affect these symbols, so in the end we will get a 2 above a 1 in $T'$. Also,
rows of $T'$ weakly increase since a strict decrease in some row would be encoded as a matched
parenthesis pair in $w(T')$, which would have also been matched in $w(T)$, implying that $T$
had a strict decrease in some row. But $T$ is a semistandard tableau so this cannot happen.
Finally, note that the shortest suffix of $w(T')$ that does not have partition content (relative to
the new ordering) is 1111222, where the leftmost 1 corresponds to the rightmost unbalanced
parenthesis in $w(T')$. Consequently, $I(5431200\cdots, T') = (5432100\cdots, T)$. Observe that
these two objects have opposite sign, but both have weight $x_1^{16} x_2^{14} x_3^{2} x_4$.


11.68. Example. Let us compute $I(v^0, T)$, where

$v^0 = 5432100\cdots$, $T =$


Moving beads on $v^0$ according to the word $w(T) = 352235112445233111$, bead 3 bumps into
bead 2 when we have scanned the suffix 3111 (which is the shortest suffix without partition
content). We therefore modify the 2's and 3's in the word as follows:

    352235112445233111
    3 223   2   233
    ( ))(   )   )((
       *        ***
    ( ))(   )   (((
    2 332   3   222
    253325113445222111

Therefore $I(v^0, T) = (v', T')$, where

$v' = 5423100\cdots$, $T' =$


Observe that $\mathrm{wt}(v^0, T) = \mathrm{wt}(v', T') = x_1^{9} x_2^{7} x_3^{6} x_4^{3} x_5^{3}$, $\operatorname{sgn}(v', T') = -\operatorname{sgn}(v^0, T)$, and
$I(v', T') = (v^0, T)$.
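
The word-level part of the involution is easy to mechanize. The sketch below is an illustration only (the name `swap_word` and the string encoding of words are our own assumptions): it matches parentheses with a stack, rewrites the unmatched run $)^a(^b$ as $)^{b-1}(^{a+1}$, and then renames every left parenthesis to $j$ and every right parenthesis to $i$. It acts on words only and does not rebuild the tableau or touch the abacus.

```python
def swap_word(word, i, j):
    """Apply the parenthesis rule of the involution I to the symbols i and j of a word.

    word is a string of single-character symbols; i plays the role of '(' and j of ')'.
    """
    stack = []          # positions of left parentheses (symbol i) still waiting for a partner
    loose_right = []    # positions of right parentheses (symbol j) that found no partner
    for pos, sym in enumerate(word):
        if sym == i:
            stack.append(pos)
        elif sym == j:
            if stack:
                stack.pop()             # matched pair
            else:
                loose_right.append(pos)
    a, b = len(loose_right), len(stack)
    if b == 0:
        raise ValueError("no unmatched left parenthesis: bead i never bumps into bead j")
    # Every '(' is finally written as j and every ')' as i ...
    new = [j if s == i else i if s == j else s for s in word]
    # ... after the unmatched run )^a (^b has been replaced by )^(b-1) (^(a+1).
    for rank, pos in enumerate(loose_right + stack):
        new[pos] = i if rank < b - 1 else j
    return "".join(new)

print(swap_word("12211122221111212222111", "2", "1"))
print(swap_word("352235112445233111", "3", "2"))
```

Calling the function a second time with the roles of the two symbols exchanged should return the original word, which mirrors the argument that $I$ is an involution.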

11.69. Example. Let us compute $c^{\lambda}_{\nu,\mu}$ when $\lambda = (5,4,4,1)$, $\nu = (3,1)$, and $\mu = (4,4,2)$.
We draw the semistandard tableaux of shape $\lambda/\nu$ whose words have the required suffix
property. The following two tableaux are the only ones, so $c^{\lambda}_{\nu,\mu} = 2$:


Let us see how these tableaux correspond to fixed points of $I$ when $N = 5$. The first tableau
changes the standard abacus $v^0 = (5432100\cdots)$ to the abacus $(54003002100\cdots)$ counted
by $\mathrm{LAbc}(\mu+\delta(5))$ by moving bead 1 twice, then bead 2 once, then bead 1 twice, and
so on. Permuting the labels gives the other 119 signed objects that make up one copy of
$a_{\mu+\delta(5)}(x_1,\ldots,x_5)$; for instance,


3425100··· ,   $\mapsto$   (34002005100···).
On the other hand, the second tableau changes the standard abacus $(5432100\cdots)$ to the
abacus $(54003002100\cdots)$ via a different sequence of collision-free bead moves: move bead
1 twice, then bead 2 twice, then bead 1 once, and so on. This pair and its permutations
produce another copy of the generating function $a_{\mu+\delta(5)}(x_1,\ldots,x_5)$. Dividing by $a_{\delta(5)}$, we
conclude that

$$s_{\lambda/\nu} = 2 s_\mu + \cdots$$

Now let us compute $c^{\lambda}_{\mu,\nu}$. The required skew tableaux, which have shape $(5,4,4,1)/(4,4,2)$
and content $(3,1)$, are:


So $c^{\lambda}_{\mu,\nu} = 2$. This illustrates the general symmetry property $c^{\lambda}_{\nu,\mu} = c^{\lambda}_{\mu,\nu}$, which is true but
not immediately evident from our combinatorial description of these coefficients. We will
prove this property later when we discuss products of Schur polynomials.


11.70. Example. For $N \ge 7$, let us find the Schur expansion of the skew Schur polynomial
$s_{(3,3,2,2)/(2,1)}$ in $N$ variables. This expansion is found by enumerating all semistandard skew
tableaux of shape $(3,3,2,2)/(2,1)$ satisfying the required suffix property. Each such tableau
of content $\mu$ contributes one term $s_\mu$ to the expansion. The relevant tableaux are shown
here:


We conclude that

$$s_{(3,3,2,2)/(2,1)} = 1 s_{(3,3,1)} + 1 s_{(3,2,2)} + 1 s_{(3,2,1,1)} + 1 s_{(2,2,2,1)}.$$
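
The enumeration in 11.66 lends itself to a short backtracking search. The sketch below is a hedged illustration (the names `lr_coefficient` and `skew_schur_expansion` are hypothetical): it fills the cells of $\lambda/\nu$ in the order in which the word is scanned from right to left (top row first, each row from right to left), rejecting a value as soon as it breaks the row condition, the column condition, the content bound, or the partition-content condition on suffixes.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def lr_coefficient(lam, nu, mu):
    """Count SSYT of shape lam/nu and content mu whose word has only partition-content suffixes."""
    rows = len(lam)
    nu = tuple(p for p in nu if p)
    if len(nu) > rows:
        return 0
    nu = nu + (0,) * (rows - len(nu))
    if any(n > l for n, l in zip(nu, lam)) or sum(lam) - sum(nu) != sum(mu):
        return 0
    # Scanning the word from right to left visits row 1 right-to-left, then row 2, and so on.
    cells = [(r, c) for r in range(rows) for c in range(lam[r] - 1, nu[r] - 1, -1)]
    filled = {}
    counts = [0] * (len(mu) + 1)        # counts[v] = number of v's placed so far

    def fill(idx):
        if idx == len(cells):
            return 1
        r, c = cells[idx]
        total = 0
        for v in range(1, len(mu) + 1):
            if counts[v] >= mu[v - 1]:
                continue                # content would exceed mu
            if v > 1 and counts[v] + 1 > counts[v - 1]:
                continue                # this suffix would not have partition content
            if (r, c + 1) in filled and v > filled[(r, c + 1)]:
                continue                # rows must weakly increase
            if (r - 1, c) in filled and v <= filled[(r - 1, c)]:
                continue                # columns must strictly increase
            filled[(r, c)] = v
            counts[v] += 1
            total += fill(idx + 1)
            counts[v] -= 1
            del filled[(r, c)]
        return total

    return fill(0)

def skew_schur_expansion(lam, nu):
    """Nonzero coefficients of s_{lam/nu} in the Schur basis, as a dict {mu: coefficient}."""
    result = {}
    for mu in partitions(sum(lam) - sum(nu)):
        c = lr_coefficient(lam, nu, mu)
        if c:
            result[mu] = c
    return result

print(skew_schur_expansion((3, 3, 2, 2), (2, 1)))
print(lr_coefficient((5, 4, 4, 1), (3, 1), (4, 4, 2)))
```

Checking the suffix condition incrementally prunes most branches early, since adding a value $v$ can only newly violate the comparison between $v-1$ and $v$.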


11.17 Products of Schur Polynomials 


Given partitions $\mu \in \mathrm{Par}_N(m)$ and $\nu \in \mathrm{Par}_N(n)$, the product $s_\mu(x_1,\ldots,x_N)\, s_\nu(x_1,\ldots,x_N)$
is a symmetric polynomial, so it can be expressed uniquely in terms of Schur polynomials
$s_\lambda(x_1,\ldots,x_N)$ indexed by $\lambda \in \mathrm{Par}_N(m+n)$:

$$s_\mu s_\nu = \sum_{\lambda} a(\lambda,\mu,\nu)\, s_\lambda \qquad (a(\lambda,\mu,\nu) \in \mathbb{Q}). \tag{11.7}$$

Now, 11.57 shows that the coefficients here are precisely the Littlewood-Richardson numbers:

$$a(\lambda,\mu,\nu) = \langle s_\mu s_\nu, s_\lambda\rangle = \langle s_\mu, s_{\lambda/\nu}\rangle = c^{\lambda}_{\nu,\mu}.$$

Since $s_\mu s_\nu = s_\nu s_\mu$ (because multiplication of polynomials is commutative), we deduce the
symmetry

$$c^{\lambda}_{\nu,\mu} = c^{\lambda}_{\mu,\nu}.$$

We now derive another combinatorial expression for these integers by viewing the product
$s_\mu s_\nu$ as a skew Schur polynomial. We claim that $s_\mu s_\nu = s_{\alpha/\beta}$, where

$$\alpha = (\mu_1+\nu_1, \mu_1+\nu_2, \ldots, \mu_1+\nu_N, \mu_1, \ldots, \mu_N), \qquad \beta = (\mu_1^N).$$

This follows since the skew shape $\alpha/\beta$ consists of two disconnected pieces, one of shape $\nu$
and one of shape $\mu$. A semistandard skew tableau of this shape can be formed by choosing
a semistandard tableau of shape $\nu$ and independently choosing a semistandard tableau of
shape $\mu$; thus the result follows from the product rule for weighted sets. We conclude that

$$c^{\lambda}_{\nu,\mu} = c^{\alpha}_{\beta,\lambda} \qquad \text{(with $\alpha$, $\beta$ as above).}$$


This formula is illustrated in the next example. 


11.71. Example. Let us compute the Schur expansion of $s_{(2,1)} s_{(2,1)}$ using the observation
$s_{(2,1)} s_{(2,1)} = s_{(4,3,2,1)/(2,2)}$. The following skew tableaux have words such that all suffixes
have partition content:


Looking at contents, we conclude that

$$s_{(2,1)} s_{(2,1)} = s_{(4,2)} + s_{(4,1,1)} + 2 s_{(3,2,1)} + s_{(3,3)} + s_{(2,2,2)} + s_{(2,2,1,1)} + s_{(3,1,1,1)}.$$


Observe that the upper-right portion of the skew tableau could only be filled in one way. So 
we could ignore this part of the tableau and just consider suitable fillings of the lower shape. 
Generalizing this remark leads to the following prescription for the Littlewood-Richardson 
coefficients. 


11.72. Theorem: Alternate Formula for Littlewood-Richardson Coefficients. For
all partitions $\lambda, \mu, \nu \in \mathrm{Par}_N$, the coefficient $c^{\lambda}_{\nu,\mu} = c^{\lambda}_{\mu,\nu}$ is the number of semistandard
tableaux $T$ of shape $\mu$ and content $\lambda-\nu = (\lambda_i-\nu_i : 1 \le i \le N)$ such that $w(T) = w_1 \cdots w_n$
satisfies the following condition: for all $k \le n$, the exponent vector of the monomial
$x^{\nu} \prod_{j=k}^{n} x_{w_j}$ is a partition. (This condition means that for all $k \le n$ and $j < N$, if
there are $a$ copies of $j$ and $b$ copies of $j+1$ in the suffix $w_k w_{k+1} \cdots w_n$, then $\nu_j + a \ge \nu_{j+1} + b$.)


Proof. We already know that $c^{\lambda}_{\nu,\mu} = c^{\alpha}_{\beta,\lambda}$, where the skew shape $\alpha/\beta$ consists of an upper
part of shape $\nu$ and a lower part of shape $\mu$. We also know that $c^{\alpha}_{\beta,\lambda}$ is the number of skew
tableaux $U$ of shape $\alpha/\beta$ and content $\lambda$ such that every suffix of $w(U)$ has partition content.
Consider the last $|\nu|$ symbols in $w(U)$. The last letter is the label in the rightmost cell of
the first row of the skew shape $\alpha/\beta$. The partition content condition forces this letter to be
1, and then all letters in the first row of the skew tableau $U$ must be 1. The letter at the
end of the second row must be strictly greater than 1, so it is 2 (by the partition content
condition), and then every entry in the second row must be 2. Proceeding in this way, we see
that for $k \le N$, every entry in row $k$ of $U$ must equal $k$. Equivalently, the last $|\nu|$ symbols
of $w(U)$ must be $N^{\nu_N} \cdots 2^{\nu_2} 1^{\nu_1}$. Call this suffix $z$.

Now we must fill the lower part of the shape $\alpha/\beta$ by choosing a semistandard tableau
$T$ of shape $\mu$. Because the upper part of $U$ has content $\nu$, the content of the entire skew
tableau $U$ will be $\lambda$ iff the content of the lower part $T$ is $\lambda-\nu$. The other condition imposed
on $T$ is that, for every suffix $y$ of $w(T)$, the suffix $yz$ of $w(U)$ has partition content. Given
the formula for $z$ above, this condition is equivalent to the condition on $w(T)$ in the theorem
statement. $\square$
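
Theorem 11.72 gives a direct way to multiply two Schur polynomials. The following sketch is our own illustration with hypothetical names (`lr_product_coefficient`, `schur_product`): it fills a straight shape $\mu$ in right-to-left word order while tracking the exponent vector, which starts at $\nu$ and must remain a partition bounded by $\lambda$ after each letter.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def lr_product_coefficient(lam, mu, nu):
    """Coefficient of s_lam in s_mu * s_nu, counted as in Theorem 11.72."""
    n = max(len(lam), len(mu), len(nu))
    lam, mu, nu = (tuple(p) + (0,) * (n - len(p)) for p in (lam, mu, nu))
    if sum(lam) != sum(mu) + sum(nu) or any(l < v for l, v in zip(lam, nu)):
        return 0
    # Cells of mu in the order met when the word of T is scanned from right to left.
    cells = [(r, c) for r in range(n) for c in range(mu[r] - 1, -1, -1)]
    exponents = list(nu)                # exponent vector of x^nu times the scanned suffix
    filled = {}

    def fill(idx):
        if idx == len(cells):
            return 1
        r, c = cells[idx]
        total = 0
        for v in range(n):              # v is the 0-based index of the letter v + 1
            if exponents[v] >= lam[v]:
                continue                # content lam - nu would be exceeded
            if v > 0 and exponents[v] + 1 > exponents[v - 1]:
                continue                # exponent vector would stop being a partition
            if (r, c + 1) in filled and v > filled[(r, c + 1)]:
                continue                # rows must weakly increase
            if (r - 1, c) in filled and v <= filled[(r - 1, c)]:
                continue                # columns must strictly increase
            filled[(r, c)] = v
            exponents[v] += 1
            total += fill(idx + 1)
            exponents[v] -= 1
            del filled[(r, c)]
        return total

    return fill(0)

def schur_product(mu, nu):
    """Nonzero coefficients of s_mu * s_nu in the Schur basis, as a dict {lam: coefficient}."""
    result = {}
    for lam in partitions(sum(mu) + sum(nu)):
        c = lr_product_coefficient(lam, mu, nu)
        if c:
            result[lam] = c
    return result

print(schur_product((2, 1), (2, 1)))
```

Because the total size is fixed, bounding the exponent vector by $\lambda$ at every step forces the final content to be exactly $\lambda-\nu$.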
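TABLE 11.1
Formulas for manipulating antisymmetric and symmetric polynomials.

Pieri rules:
  $a_{\lambda+\delta(N)}\, p_k = \sum_{\beta:\ \beta/\lambda \text{ is a } k\text{-ribbon } R} \operatorname{sgn}(R)\, a_{\beta+\delta(N)}$;
  $a_{\lambda+\delta(N)}\, e_k = \sum_{\beta:\ \beta/\lambda \text{ is a vertical } k\text{-strip}} a_{\beta+\delta(N)}$;
  $a_{\lambda+\delta(N)}\, h_k = \sum_{\beta:\ \beta/\lambda \text{ is a horizontal } k\text{-strip}} a_{\beta+\delta(N)}$;
  $s_\lambda\, p_k = \sum_{\beta:\ \beta/\lambda \text{ is a } k\text{-ribbon } R} \operatorname{sgn}(R)\, s_\beta$;
  $s_\mu\, p_\alpha = \sum_{\lambda} \chi^{\lambda/\mu}_{\alpha}\, s_\lambda$.

Determinant formula for $s_\lambda$:
  $s_\lambda = \dfrac{a_{\lambda+\delta(N)}}{a_{\delta(N)}} = \dfrac{\det\|x_j^{\lambda_i+N-i}\|_{1\le i,j\le N}}{\det\|x_j^{N-i}\|_{1\le i,j\le N}}$.

Schur expansion of power-sums:
  $p_\alpha = \sum_{\lambda} \chi^{\lambda}_{\alpha}\, s_\lambda$.

Power-sum expansion of Schur polys.:
  $s_\lambda = \sum_{\mu} \dfrac{\chi^{\lambda}_{\mu}}{z_\mu}\, p_\mu$.

Formulas for skew Schur polynomials:
  $s_{\lambda/\mu} = \sum_{\nu} \dfrac{\chi^{\lambda/\mu}_{\nu}}{z_\nu}\, p_\nu$;
  $\langle s_{\lambda/\mu}, f\rangle = \langle s_\lambda, s_\mu f\rangle$ for $f \in \Lambda_N$;
  $\omega(s_{\lambda/\mu}) = s_{\lambda'/\mu'}$;
  $s_{\lambda/\mu} = \det\|h_{\lambda_i-\mu_j+j-i}\|_{1\le i,j\le \ell(\lambda)}$;
  $s_{\lambda/\mu} = \det\|e_{\lambda'_i-\mu'_j+j-i}\|_{1\le i,j\le \lambda_1}$;
  $s_{\lambda/\nu} = \sum_{\mu} c^{\lambda}_{\nu,\mu}\, s_\mu$;
  $s_\mu s_\nu = \sum_{\lambda} c^{\lambda}_{\nu,\mu}\, s_\lambda$.

Inverse Kostka formulas ($K'_{\lambda,\mu} = \sum_{S\in\mathrm{SRHT}(\mu,\lambda)} \operatorname{sgn}(S)$):
  $m_\lambda = \sum_{\mu} K'_{\lambda,\mu}\, s_\mu$;
  $s_\mu = \sum_{\lambda} K'_{\lambda,\mu}\, h_\lambda$;
  $s_{\mu'} = \sum_{\lambda} K'_{\lambda,\mu}\, e_\lambda$.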


Summary 


Table 11.1 contains formulas derived in this chapter for computing with antisymmetric and 


symmetric polynomials. 


• Unlabeled Abaci. An abacus is a function $w : \mathbb{Z} \to \{0,1\}$ with $w(i) = 1$ for all small
enough $i$ and $w(j) = 0$ for all large enough $j$. Justification of abaci gives a bijection to
pairs $(m, \lambda) \in \mathbb{Z} \times \mathrm{Par}$. The inverse bijection can be computed by traversing the frontier
of $\mathrm{dg}(\lambda)$, converting north steps to beads (1's) and east steps to gaps (0's), and using
$m$ to decide which step on the frontier corresponds to position 0 of $w$.


• Jacobi Triple Product Identity. Abaci can be used to prove

$$\sum_{m\in\mathbb{Z}} q^{m(m+1)/2} u^m = \prod_{n\ge 1}(1+uq^n) \prod_{n\ge 0}(1+u^{-1}q^n) \prod_{n\ge 1}(1-q^n).$$

One consequence is the formula $\prod_{n\ge 1}(1-q^n) = \sum_{k\in\mathbb{Z}} (-1)^k q^{k(3k-1)/2}$.


• k-cores and k-quotients. Given a partition $\mu$, repeated removal of border ribbons of size
$k$ (in any order) will lead to a unique partition from which no further ribbons of this kind
can be removed. This partition is called the $k$-core of $\mu$. We can also find the $k$-core by
converting $\mu$ to an abacus, decimating the abacus to give a $k$-runner abacus, justifying all
runners, and converting back to a partition. Each ribbon removal corresponds to moving
one bead one step to the left on the $k$-runner abacus. Justifying each separate runner on
the $k$-runner abacus for $\mu$ produces the $k$-quotients $(\nu^0, \ldots, \nu^{k-1})$ of $\mu$. Alternatively,
$\mathrm{dg}(\nu^i)$ can be found by taking the cells of $\mathrm{dg}(\mu)$ lying due north and due west of steps
of $k$-content $i$ on the frontier of $\mu$. We get a bijection $\Delta_k : \mathrm{Par} \to \mathrm{Core}(k) \times \mathrm{Par}^k$ by
mapping $\mu$ to its $k$-core and $k$-quotients.


• Labeled Abaci and Antisymmetric Polynomials. A polynomial $f$ in $N$ variables is antisymmetric
iff interchanging any two adjacent variables changes the sign of $f$. For each
$\mu = (\mu_1 > \mu_2 > \cdots > \mu_N \ge 0)$, the polynomial $a_\mu(x_1,\ldots,x_N) = \det\|x_j^{\mu_i}\|_{1\le i,j\le N}$
is antisymmetric. Writing $\delta(N) = (N-1, N-2, \ldots, 2, 1, 0)$, the set
$\{a_{\lambda+\delta(N)} : \lambda \in \mathrm{Par}_N\}$ is a basis for the space $A_N$ of antisymmetric polynomials. Division by
$a_{\delta(N)} = \prod_{1\le i<j\le N}(x_i - x_j)$ gives a vector space isomorphism from $A_N$ to $\Lambda_N$ sending
$a_{\lambda+\delta(N)}$ to the Schur polynomial $s_\lambda = a_{\lambda+\delta(N)}/a_{\delta(N)}$. To model the terms in $a_{\lambda+\delta(N)}$,
we use the $N!$ labeled abaci consisting of beads $1, 2, \ldots, N$ (in any order) at positions
given by $\lambda+\delta(N)$.


• Rim-Hook Tableaux. A rim-hook tableau of shape $\lambda/\mu$ and content $\alpha$ is obtained by
enlarging the diagram of $\mu$ using border ribbons of lengths $\alpha_1, \alpha_2, \ldots$ (in this order)
until the diagram of $\lambda$ is obtained. The set of such tableaux is denoted $\mathrm{RHT}(\lambda/\mu, \alpha)$.
A ribbon occupying $r$ rows has sign $(-1)^{r-1}$, and the sign of a rim-hook tableau is the
product of the signs of its ribbons. We write $\chi^{\lambda/\mu}_{\alpha} = \sum_{T\in\mathrm{RHT}(\lambda/\mu,\alpha)} \operatorname{sgn}(T)$. We have
$\chi^{\lambda/\mu}_{\alpha} = \chi^{\lambda/\mu}_{\beta}$ whenever $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$.


• Interactions between Abaci and Tableaux. One can give combinatorial proofs of several
identities in Table 11.1 by using the word of a tableau to encode bead motions on abaci.
When these motions lead to bead collisions, one obtains two objects of opposite sign
and equal weight that cancel terms on one side of the formula to be proved. Objects
with no collisions are fixed points that can be reorganized to give the other side of the
formula.


• Inverse Kostka Matrix. A rim-hook tableau is called special iff each ribbon in the tableau
begins in the leftmost column; $\mathrm{SRHT}(\mu, \lambda)$ is the set of such tableaux of shape $\mu$ and
content $\alpha$ with $\mathrm{sort}(\alpha) = \lambda$. Letting $K'_{\lambda,\mu} = \sum_{S\in\mathrm{SRHT}(\mu,\lambda)} \operatorname{sgn}(S)$, we have $KK' = I$.


• Littlewood-Richardson Coefficients. The scalars $c^{\lambda}_{\nu,\mu} = c^{\lambda}_{\mu,\nu}$ appearing in the Schur
expansions of $s_{\lambda/\nu}$ and $s_\mu s_\nu$ count semistandard tableaux $T$ of shape $\lambda/\nu$ and content $\mu$
such that every suffix of the word of $T$ has partition content. The scalars $c^{\lambda}_{\nu,\mu}$ also count
semistandard tableaux $T$ of shape $\mu$ and content $\lambda-\nu$ such that $w(T) = w_1 \cdots w_n$
satisfies the following condition: for all $k \le n$, the exponent vector of $x^{\nu} \prod_{j=k}^{n} x_{w_j}$
is a partition.
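
The abacus recipe for $k$-cores recalled in the summary can be sketched in a few lines. This is an illustration under a simplifying assumption: instead of a doubly infinite abacus it uses a finite one with beads at positions $\mu_i + \ell - i$ (where $\ell$ is the number of parts), which yields the same core. The function name `k_core` is hypothetical.

```python
def k_core(mu, k):
    """k-core of a partition, computed on a finite k-runner abacus."""
    n = len(mu)
    beads = [part + (n - 1 - i) for i, part in enumerate(mu)]    # positions mu + delta(n)
    # Decimate: runner r holds the beads whose position is congruent to r modulo k.
    runner_sizes = [0] * k
    for b in beads:
        runner_sizes[b % k] += 1
    # Justify every runner: its beads slide down to positions r, r + k, r + 2k, ...
    justified = sorted((r + k * t for r in range(k) for t in range(runner_sizes[r])),
                       reverse=True)
    # Convert the bead positions back to a partition and drop zero parts.
    core = [b - (n - 1 - i) for i, b in enumerate(justified)]
    return tuple(p for p in core if p > 0)

print(k_core((10, 10, 10, 8, 8, 8, 7, 4), 3))
```

Each unit slide of a bead along its runner corresponds to removing one border ribbon of size $k$, so the number of slides needed to justify the runners is the number of ribbons removed.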
Exercises

11.73. Let $w = \cdots 1101101110101001100 \cdots$. Compute $\mathrm{wt}(w)$ and $J(w)$.

11.74. Compute $U(-1, \mu)$ for each $\mu \in \mathrm{Par}(5)$.


11.75. In the computation of $U(m, \mu)$ in 11.4, describe in detail how to use $m$ to decide
which symbol in the bead-gap sequence is $w_0$.


11.76. Given $\mu \in \mathrm{Par}$, what is the relationship between the abaci $U(-1, \mu)$ and $U(-1, \mu')$?


11.77. Show that the abacus $w$ in 11.2 and its justification $J(w)$ have the same weight, if
we use the weights defined in the proof of 11.5.


11.78. Show how to deduce Euler’s pentagonal number theorem as an algebraic consequence 
of the Jacobi triple product identity. 


11.79. Fill in all the details in the proof of 11.6. 


11.80. Use 11.5 to simplify the product $\prod_{n\ge 0}(1-x^{5n+1})^{-1} \prod_{n\ge 0}(1-x^{5n+4})^{-1}$ appearing
in one of the Rogers-Ramanujan identities. Can you give a direct proof of the resulting
identity using abaci?


11.81. Use 11.4 to find a bijective proof of 11.5 that makes no reference to abaci, instead
using combinatorial operations on partition diagrams and their frontiers.


11.82. Complete the proof of 11.12 by verifying that $D_k(w) \in \mathrm{Abc}^k$, $I_k(v) \in \mathrm{Abc}$, and $D_k$
and $I_k$ are two-sided inverses.


11.83. (a) Verify that the 3-core of $\mu = (10,10,10,8,8,8,7,4)$ is $(1,1)$ by removing border
3-ribbons from $\mu$ in several different orders. (b) Use the 3-runner abacus encoding $\mu$ to
determine exactly how many ways there are to change $\mu$ into $(1,1)$ by removing an ordered
sequence of border 3-ribbons.


11.84. Let $\mu = (8,7,6,4,4,4,3,1,1,1)$. Use abaci to compute the $k$-core and $k$-quotients
of $\mu$ for $1 \le k \le 6$.


11.85. Find all integer partitions that are 2-cores, and draw some of their diagrams. 
11.86. Find all 3-cores with at most 8 cells. 
11.87. Verify the assertion in the last sentence of 11.20. 


11.88. Let $\mu = (8,8,8,8,8,8,8,8)$. (a) Use abaci to compute the $k$-core and $k$-quotients of
$\mu$ for $3 \le k \le 8$. (b) Use the construction at the end of §11.4 to compute the $k$-quotients of
$\mu$ (for $3 \le k \le 8$) directly from the diagram of $\mu$.


11.89. Compute the $k$-quotients of $\mu = (6,6,6,3,3,2,2,2,1,1)$ without using abaci, for
$k = 3, 4, 5$.


11.90. Consider the construction at the end of §11.4 for computing $k$-quotients of $\mu$. Show
that the hook-length of each unerased cell is divisible by $k$, and these are the only cells in
the diagram of $\mu$ whose hook-lengths are divisible by $k$.


11.91. For each $k \ge 1$, find a formula for the generating function $\sum_{\mu\in\mathrm{Core}(k)} q^{|\mu|}$.
11.92. Given that $\mu$ has $k$-core $\rho$ and $k$-quotients $\nu^0, \ldots, \nu^{k-1}$, find a formula for the number
of ways we can go from $\mu$ to $\rho$ by removing an ordered sequence of border $k$-ribbons.


11.93. Compute $a_\mu(x_1, x_2, x_3)$ and $a_{\lambda+\delta(3)}(x_1, x_2, x_3)$ for $\mu = (6,3,1)$ and $\lambda = (2,2,1)$.
11.94. Verify by direct calculation that, for $N = 3$ and $\lambda = (2,1,0)$, $a_{\lambda+\delta(N)}$ is divisible by
$a_{\delta(N)}$ and $a_{\lambda+\delta(N)}/a_{\delta(N)} = s_\lambda(x_1,\ldots,x_N)$.


11.95. Verify that $\Lambda_N$ and $A_N$ are subspaces of $K[x_1,\ldots,x_N]$, and that the map
$f \mapsto f a_{\delta(N)}$ (for $f \in \Lambda_N$) is $K$-linear.


11.96. (a) Show that the product of two antisymmetric polynomials is symmetric. (b) 
Show that the product of a symmetric polynomial and an antisymmetric polynomial is 
antisymmetric. 


11.97. Define a map T: K[1,...,0n] > K[a1,...,2n] by setting 


1 
T(f)= Wi S* sgn(w) f(@w(1),-+ +> %w(N)): 


weSn 


Show that T is a K-linear map with image Ay whose restriction to Ay is the identity map. 
Can you describe the kernel of T’? 


11.98. Let v be the labeled abacus v = 0041000300502600···. Compute wt(v), w(v), 
pos(v), and sgn(v). For which $\lambda$ is v in $\mathrm{LAbc}(\lambda + \delta(6))$? 


11.99. Draw all the labeled abaci in LAbc(6, 5,1), and compute the sign of each abacus. 


11.100. Using N = 6 variables, compute: 


(a) $a_{(4,2,1)+\delta(6)}\, p_4$; (b) $a_{(3,3,3)+\delta(6)}\, p_3$; (c) $a_{(1,1,1,1,1)+\delta(6)}\, p_2$. 
How would the answers change if we changed N? 


11.101. Let $v = 0310040206500\cdots \in \mathrm{LAbc}(\lambda + \delta(6))$ and k = 4. For $1 \le i \le 6$, compute 
I(v, i), where I is the involution in the proof of 11.39. For any fixed points that arise, compute 
v* and indicate which border 4-ribbon is added to dg($\lambda$) in the passage from v to v*. 


11.102. Using N = 6 variables, compute: 

(a) $a_{(4,2,1)+\delta(6)}\, e_3$; (b) $a_{(3,3,3)+\delta(6)}\, e_2$; (c) $a_{(5,4,3,1,1)+\delta(6)}\, e_4$. 

How would the answers change if we changed N? 

11.103. Let $v = 0310040206500\cdots \in \mathrm{LAbc}(\lambda + \delta(6))$. Compute I(v, S) for S = {2,5,6}, 
S = {1,4,5}, S = {1,3,4}, and S = {3,4,6}, where I is the involution in the proof of 11.42. 
For any fixed points that arise, compute v* and indicate which vertical strip is added to 
dg($\lambda$) in the passage from v to v*. 


11.104. Using N = 5 variables, compute: 


(a) $a_{(4,2,1)+\delta(6)}\, h_3$; (b) $a_{(3,3,3)+\delta(6)}\, h_3$; (c) $a_{(5,4,3,1,1)+\delta(6)}\, h_4$. 
How would the answers change if we changed N? 


11.105. Let $v = 0310040206500\cdots \in \mathrm{LAbc}(\lambda + \delta(6))$. Compute I(v, M) for M = [1,1,4,5], 
M = [2,2,5,6], M = [2,4,5,5], and M = [1,2,3,4], where I is the involution in the proof 
of 11.44. For any fixed points that arise, compute v* and indicate which horizontal strip is 
added to dg($\lambda$) in the passage from v to v*. 


11.106. Explain in detail why the bead motion rule in §11.9 leads to the addition of a 
horizontal k-strip to the shape $\lambda$, assuming no bead collision occurs. 


11.107. In the proof of 11.44, check in detail that I reverses signs, preserves weights, and 
is an involution. 


11.108. Reprove 11.45 by comparing the symmetric and antisymmetric Pieri rules for 
multiplication by e,. 

11.109. Expand the following symmetric polynomials into linear combinations of Schur 
polynomials: (a) $s_{(3,3,2)}\, p_3$; (b) $p_{(3,1,3)}$; (c) $s_{(2,2)}\, p_{(2,1)}$. 

11.110. Compute the coefficients of the following Schur polynomials in the Schur expansion 
of $p_{(3,3,2,1)}$: (a) $s_{(9)}$; (b) $s_{(3,3,3)}$; (c) $s_{(4,4,1)}$; (d) $s_{(1^9)}$. 


11.111. Show that, for $\lambda \in \mathrm{Par}(n)$, $\chi^{\lambda}_{(1^n)} = |\mathrm{SYT}(\lambda)|$. 
11.112. Write $s_{(3,2,1)}$ as a linear combination of power-sum polynomials. 


11.113. For each $\mu \in \mathrm{Par}(4)$, write $p_{\mu}$ in terms of Schur polynomials, and write $s_{\mu}$ in terms 
of power-sum polynomials. 


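11.114. Let I be the involution in §11.12. For each $(v, T) \in X$ given below, compute 
I(v, T). If (v, T) is a fixed point, compute $v^* \in Y$. 

(a) v = 5432100···, T = [tableau not recoverable from the source] 

(b) v = 2431500···, T = [tableau not recoverable from the source] 

(c) v = 3452100···, T = [tableau not recoverable from the source] 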


11.115. Let I, X, and Y be defined as in §11.12. Take N = 3 and $\lambda = (2,1,0)$. List all the 
elements of X and Y, compute the action of I on X, and show how the fixed points of I 
map bijectively to Y. 


11.116. Verify all the assertions stated just before 11.54. 
11.117. Express $s_{(4,3,1)/(2,1)}$ as a linear combination of power-sums. 
11.118. Explain why the formulas $\omega(h_{\mu}) = e_{\mu}$ and $\omega(e_{\mu}) = h_{\mu}$ are special cases of 11.59. 


11.119. For $N \ge k$, two linear operators S and T on $\Lambda_N^k$ are called adjoint iff $\langle S(f), g \rangle = 
\langle f, T(g) \rangle$ for all $f, g \in \Lambda_N^k$. Prove that this condition holds for all such f, g iff it holds for 
all f in some basis of $\Lambda_N^k$ and all g in some (possibly different) basis of $\Lambda_N^k$. 


11.120. Write the following Schur polynomials in terms of the complete symmetric 
polynomials $h_{\mu}$: (a) $s_{(5,3)}$; (b) $s_{(4,1,1)}$; (c) $s_{(5,5,2,2)}$. 


11.121. Write the following Schur polynomials in terms of the elementary symmetric 
polynomials $e_{\mu}$: (a) $s_{(2,2,2,2)}$; (b) $s_{(3,2,1)}$; (c) $s_{(4,2)}$. 


11.122. Write the skew Schur polynomial $s_{(4,4,3)/(2,1,1)}$ in terms of: (a) the $h_{\mu}$'s; (b) the 
$e_{\mu}$'s; (c) the $p_{\mu}$'s; (d) the $m_{\mu}$'s. 


11.123. Modify the definition of the involution used in the proof of 11.60 as follows. If two 
or more paths in $(w, p_1, \ldots, p_n)$ intersect, choose i minimal and then j minimal such that 
$p_i$ and $p_j$ intersect. Let (u, v) be the earliest vertex on $p_i$ that is also a vertex of $p_j$, and 
switch the initial segments of these two paths as in the original proof. Show that the map 
just defined is not always an involution. 


11.124. Can you find a way to rephrase the combinatorial proof of 11.60 in terms of abaci? 


11.125. Enumerate special rim-hook tableaux to compute $K'_{\lambda,\mu}$ for all partitions $\lambda$, $\mu$ with 
at most 4 cells. Use this to confirm by direct calculation that KK' = I. 


11.126. Find and prove a Pieri-type rule giving the Schur expansion of a product s,m). 


11.127. Let K' be the matrix defined combinatorially by $K'_{\mu,\lambda} = \sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \operatorname{sgn}(S)$. 
Find involutions that prove KK’ = I. 


11.128. Let K’ be the inverse Kostka matrix, defined using special rim-hook tableaux. Can 
you prove the identity K’K = I combinatorially? 


11.129. Let I be the involution in the proof of 11.66. (a) Compute $I(v^0, T)$, where 


$v^0 = 5432100\cdots$ and T = [tableau not recoverable from the source]. 


(b) Answer (a) if the last 1 in the top row of T is changed to a 2. (c) Answer (a) if the last 
3 in row 2 of T is changed to a 2. 


11.130. Compute $c^{\lambda}_{\mu,\nu}$ and $c^{\lambda}_{\nu,\mu}$ using 11.66, where: (a) $\lambda = (5,3,1,1)$, $\mu = (3,1)$, $\nu = 
(3,2,1)$; (b) $\lambda = (5,4,4,3,1)$, $\mu = (4,3,3,1)$, $\nu = (3,1,1,1)$. 


11.131. Repeat the previous exercise, but use 11.72 to compute the Littlewood-Richardson 
coefficients. 


11.132. Continuing 11.69, find the expansion of $s_{(5,4,4,1)/(3,1)}$ into a sum of Schur 
polynomials. 


11.133. Expand the following skew Schur polynomials into sums of Schur polynomials: (a) 
$s_{(3,3,3)/(2,1)}$; (b) $s_{(5,4)/(2)}$; (c) $s_{(4,3,2,1)/(1,1,1)}$. 


11.134. Expand $s_{(3,2)}\, s_{(2,2)}$ into a sum of Schur polynomials. 


11.135. In the Schur expansion of $s_{(3,2,1,1)}^2$, find the coefficients of: (a) $s_{(5,4,2,2,1)}$; (b) 
$s_{(5,3,3,1,1,1)}$; (c) $s_{(4,3,3,2,1,1)}$. 


11.136. Give a combinatorial proof of 11.72 based on abaci. 




Notes 


The proof of the Jacobi triple product identity in §11.2 is adapted from a lecture of Richard 
Borcherds. One source for material on unlabeled abaci, k-cores, and k-quotients is the book 
by James and Kerber [72]; for labeled abaci, see Loehr [84]. Gessel and Viennot [53] have 
used intersecting lattice path models to prove many enumeration results. The combinatorial 
interpretation of the inverse Kostka matrix is due to Egecioglu and Remmel [33]. The proof 
of the Littlewood-Richardson rule given in §11.16 may be viewed as a combinatorialization 
of the algebraic proof in Remmel and Shimozono [111]. Many other proofs of this rule may 
be found in the literature; see, e.g., the bibliographic notes in Fulton [46] and Stanley [127, 
Ch. 7]. 
Additional Topics 


This chapter covers a variety of topics illustrating different aspects of enumerative combi- 
natorics and probability. The treatment of each topic is essentially self-contained. 




12.1 Cyclic Shifting of Paths 


This section illustrates another technique for enumerating certain collections of lattice paths. 
The basic idea is to introduce an equivalence relation on paths by cyclically shifting the 
steps of a path. A similar idea was used in §3.14 to enumerate lists of terms. 


12.1. Theorem: Enumeration of Rational-Slope Dyck Paths. Let r and s be positive 
integers such that gcd(r,s) = 1. The number of lattice paths from (0,0) to (r,s) that never 
go below the diagonal line sx = ry is 

\[ \frac{1}{r+s}\binom{r+s}{r,\,s}. \]


(Such paths are called r/s-Dyck paths.) 


Proof. Step 1. Let $X = R(E^r N^s)$, which is the set of all rearrangements of r copies of E and 
s copies of N. Thinking of E as an east step and N as a north step, we see that X can be 
identified with the set of all lattice paths from (0,0) to (r,s). Given $v = v_1 v_2 \cdots v_{r+s} \in X$, 
we define an associated label vector $L(v) = (m_0, m_1, \ldots, m_{r+s})$ as follows. We set $m_0 = 0$. 
Then we recursively calculate $m_i = m_{i-1} + r$ if $v_i = N$, and $m_i = m_{i-1} - s$ if $v_i = E$. For 
example, if r = 5, s = 3, and v = NEEENENE, then L(v) = (0, 5, 2, −1, −4, 1, −2, 3, 0). We 
can also describe this construction in terms of the lattice path encoded by v. If we label each 
lattice point (x,y) on this path by the integer ry − sx, then L(v) is the sequence of labels 
encountered as we traverse the path from (0,0) to (r,s). This construction is illustrated 
by the lattice paths in Figure 12.1. Note that v is recoverable from L(v), since $v_i = N$ iff 
$m_i - m_{i-1} = r$ and $v_i = E$ iff $m_i - m_{i-1} = -s$. 

Step 2. We prove that for all $v \in X$, if $L(v) = (m_0, m_1, \ldots, m_{r+s})$ then $m_0, m_1, \ldots, 
m_{r+s-1}$ are distinct, whereas $m_{r+s} = 0 = m_0$. To see this, suppose there exist x, y, a, b with 
$0 \le a \le r$ and $0 \le b \le s$, not both zero, such that (x, y) and (x + a, y + b) are two points on the 
lattice path for v that have the same label. This means that ry − sx = r(y + b) − s(x + a), which 
simplifies to rb = sa. In particular, a and b are both positive (if one were zero, so would be the 
other), so the number rb = sa is a positive common multiple of r and s. Since 
gcd(r, s) = 1, we have lcm(r, s) = rs, so that $rb \ge rs$ and $sa \ge rs$. Thus $b \ge s$ and $a \ge r$, 
forcing b = s and a = r. But then (x, y) must be (0,0) and (x + a, y + b) must be (r, s). So 
the only two points on the path with equal labels are (0,0) and (r,s), which correspond to 
$m_0$ and $m_{r+s}$. 

Step 3. Introduce an equivalence relation ~ on X by setting v ~ w iff v is a cyclic 
shift of w. More precisely, defining $C(w_1 w_2 \cdots w_{r+s}) = w_2 w_3 \cdots w_{r+s} w_1$, we have v ~ w iff 
$v = C^i(w)$ for some integer i (which can be chosen in the range $0 \le i < r+s$). For each 
$v \in X$, let $[v] = \{w \in X : w \sim v\}$ be the equivalence class of v relative to this equivalence 
relation. Figure 12.1 shows the paths in the class [NEEENENE]. 


FIGURE 12.1 
Cyclic shifts of a lattice path. 

Step 4. We show that |[v]| = r + s for all $v \in X$, which means that all r + s cyclic shifts of 
v are distinct. Suppose $v = v_1 v_2 \cdots v_{r+s}$ has $L(v) = (m_0, m_1, \ldots, m_{r+s})$. By definition of L, 
for each i with $0 \le i < r+s$, the label vector of the cyclic shift $C^i(v) = v_{i+1} \cdots v_{r+s} v_1 \cdots v_i$ 
is 

\[ L(C^i(v)) = (0,\ m_{i+1} - m_i,\ m_{i+2} - m_i,\ \ldots,\ m_{r+s} - m_i,\ m_1 - m_i,\ \ldots,\ m_i - m_i) \]

(cf. Figure 12.1). The set of integers appearing in the label vector $L(C^i(v))$ is therefore 
obtained from the set of integers in L(v) by subtracting $m_i$ from each integer in the latter 
set. In particular, if $\mu$ is the smallest integer in L(v), then the smallest integer in $L(C^i(v))$ 
is $\mu - m_i$. Since the numbers $m_0, m_1, \ldots, m_{r+s-1}$ are distinct (by step 2), we see that the 
minimum elements in the sequences $L(C^i(v))$ are distinct, as i ranges from 0 to r + s − 1. 
This implies that the sequences $L(C^i(v))$, and hence the words $C^i(v)$, are pairwise distinct. 

Step 5. We show that, for all $v \in X$, there exists a unique word $w \in [v]$ such that w 
encodes a rational-slope Dyck path. By the way we defined the labels, w is an r/s-Dyck 
path iff L(w) has no negative entries. Recall from step 4 that the set of labels in $L(C^i(v))$ is 
obtained from the set of labels in L(v) by subtracting $m_i$ from each label in the latter set. 
By step 2, there is a unique i in the range $0 \le i < r+s$ such that $m_i = \mu$, the minimum 
value in L(v). For this choice of i, we have $m_j \ge \mu = m_i$ for every j, so that $m_j - m_i \ge 0$ 
and $L(C^i(v))$ has no negative labels. For any other choice of i, $m_i > \mu$ by step 4, so that 
$L(C^i(v))$ contains the negative label $\mu - m_i$. 

Step 6. Suppose ~ has n equivalence classes in X. By step 5, n is also the number of 
rational-slope Dyck paths. By step 4, each equivalence class has size r + s. Since X is the 
disjoint union of its equivalence classes, the sum rule and 1.46 give 

\[ \binom{r+s}{r,\,s} = |X| = n(r+s). \]

Dividing by r + s gives the formula stated in the theorem. □ 


FIGURE 12.2 
Comparing Dyck paths to (n + 1)/n-Dyck paths. 
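
The six steps of this proof can be checked by brute force for small r and s. The following Python sketch is an illustration only: the function names are ours, not the book's, and the multinomial coefficient $\binom{r+s}{r,s}$ is computed as the ordinary binomial coefficient `comb(r + s, r)`. It builds label vectors as in step 1, picks out the Dyck representative of a cyclic class as in step 5, and compares a direct count with the formula of 12.1.

```python
from itertools import combinations
from math import comb


def label_vector(word, r, s):
    """Labels ry - sx along the path: add r for each N step, subtract s for each E step."""
    labels = [0]
    for step in word:
        labels.append(labels[-1] + (r if step == "N" else -s))
    return labels


def all_paths(r, s):
    """Yield every word with r copies of E and s copies of N."""
    total = r + s
    for north_positions in combinations(range(total), s):
        chosen = set(north_positions)
        yield "".join("N" if i in chosen else "E" for i in range(total))


def is_dyck(word, r, s):
    """A word encodes an r/s-Dyck path iff no label is negative."""
    return min(label_vector(word, r, s)) >= 0


def dyck_shift(word, r, s):
    """Cyclic shift starting at the minimum label (unique when gcd(r, s) = 1)."""
    labels = label_vector(word, r, s)[:-1]  # m_0, ..., m_{r+s-1}
    i = labels.index(min(labels))
    return word[i:] + word[:i]


def count_dyck(r, s):
    """Count r/s-Dyck paths by testing every path from (0,0) to (r,s)."""
    return sum(1 for w in all_paths(r, s) if is_dyck(w, r, s))


if __name__ == "__main__":
    print(label_vector("NEEENENE", 5, 3))
    print(dyck_shift("NEEENENE", 5, 3))
    print(count_dyck(5, 3), comb(8, 5) // 8)
```

The enumeration is exponential in r + s, so this is suitable only for sanity checks on small cases.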


12.2. Corollary: Enumeration of Dyck Paths and m-Dyck Paths. For $n \ge 1$, the 
number of Dyck paths ending at (n, n) is 

\[ \frac{1}{2n+1}\binom{2n+1}{n+1,\,n}. \]

For $m, n \ge 1$, the number of m-Dyck paths ending at (mn, n) is 

\[ \frac{1}{(m+1)n+1}\binom{(m+1)n+1}{mn+1,\,n}. \]

Proof. Let X be the set of Dyck paths ending at (n, n), and let X' be the set of (n + 1)/n-Dyck 
paths ending at (n + 1, n). Since gcd(n + 1, n) = 1, we know that $|X'| = \frac{1}{2n+1}\binom{2n+1}{n+1,\,n}$. 
On the other hand, passing from the diagonal y = x to the line (n + 1)y − nx = 0 does 
not introduce any new lattice points in the region of interest, except for (n + 1, n). See 
Figure 12.2. It follows that appending a final east step gives a bijection from X onto X', 
so the first result holds. The second result is proved in the same way: appending a final 
east step gives a bijection from the set of m-Dyck paths ending at (mn, n) to the set of 
(mn + 1)/n-Dyck paths ending at (mn + 1, n). □ 
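
As a numerical check of the corollary, the sketch below counts m-Dyck paths directly and compares the result with the stated formula. It assumes (this definition is not restated in the present section) that an m-Dyck path is a lattice path from (0,0) to (mn, n) whose points all satisfy x ≤ my; the function names are hypothetical.

```python
from functools import lru_cache
from math import comb


def count_m_dyck(m, n):
    """Count lattice paths from (0,0) to (m*n, n) that satisfy x <= m*y at every point."""

    @lru_cache(maxsize=None)
    def paths_from(x, y):
        if x > m * y:  # the point lies strictly below the line x = m*y
            return 0
        if (x, y) == (m * n, n):
            return 1
        total = 0
        if y < n:
            total += paths_from(x, y + 1)  # north step
        if x < m * n:
            total += paths_from(x + 1, y)  # east step
        return total

    return paths_from(0, 0)


def corollary_formula(m, n):
    """The closed formula of 12.2; the multinomial has two parts, so comb suffices."""
    total = (m + 1) * n + 1
    return comb(total, n) // total


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, [count_m_dyck(m, n) for n in range(1, 6)])
```

With m = 1 the counts are the Catalan numbers, in agreement with the first formula of the corollary.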




12.2 Chung-Feller Theorem 


In §1.10, we defined Dyck paths and proved that the number of Dyck paths of order n is 
the Catalan number $C_n = \frac{1}{n+1}\binom{2n}{n,\,n}$. This section discusses a remarkable generalization of 
this result called the Chung-Feller Theorem. 


12.3. Definition: Flawed Paths. Suppose $\pi = ((x_0, y_0), \ldots, (x_{2n}, y_{2n}))$ is a lattice path 
from (0,0) to (n, n). For $1 \le j \le n$, we say that $\pi$ has a flaw in row j iff there exists a point 
$(x_i, y_i)$ visited by $\pi$ such that $y_i = j - 1$, $y_i < x_i$, and $(x_{i+1}, y_{i+1}) = (x_i, y_i + 1)$. This means 
that the jth north step of $\pi$ occurs in the region southeast of the diagonal line y = x. For 
$1 \le j \le n$, define 

\[ X_j(\pi) = \chi(\pi \text{ has a flaw in row } j). \]

Also define the number of flaws of $\pi$ by setting $\mathrm{flaw}(\pi) = X_1(\pi) + X_2(\pi) + \cdots + X_n(\pi)$. 


For example, the paths shown in Figure 12.3 have zero and six flaws, respectively. The 
paths shown in Figure 12.4 have five and zero flaws, respectively. Observe that $\pi$ is a Dyck 
path iff $\mathrm{flaw}(\pi) = 0$. 


12.4. Chung-Feller Theorem. Fix n > 0, and let A be the set of lattice paths from (0,0) 
to (n, n). For $0 \le k \le n$, let 

\[ A_k = \{\pi \in A : \mathrm{flaw}(\pi) = k\}. \]

Then $|A_k| = |A_0|$ for all k. In particular, for $0 \le k \le n$, 

\[ |A_k| = \frac{|A|}{n+1} = \frac{1}{n+1}\binom{2n}{n,\,n}. \]


Proof. Fix k > 0. To prove that $|A_0| = |A_k|$, we define a bijection $\phi_k : A_0 \to A_k$. See 
Figure 12.3 for an example where n = 10 and k = 6. Given a Dyck path $\pi \in A_0$, we begin 
by drawing the line y = k superimposed on the Dyck path. There is a unique point $(x_i, y_i)$ 
on $\pi$ such that $y_i = k$ and $\pi$ arrives at $(x_i, y_i)$ by taking a vertical step. Call this step the 
special vertical step. Let $(a_1, a_2, \ldots) \in \{H, V\}^{2n - x_i - y_i}$ be the ordered sequence of steps of 
$\pi$ reading northeast from $(x_i, y_i)$, where H means “horizontal step” and V means “vertical 
step.” Let $(b_0 = V, b_1, b_2, \ldots) \in \{H, V\}^{x_i + y_i}$ be the ordered sequence of steps of $\pi$ reading 
southwest from $(x_i, y_i)$. For the Dyck path shown on the left in Figure 12.3, we have 


We compute $\phi_k(\pi) = c_1 c_2 \cdots c_{2n} \in \{V, H\}^{2n}$ as follows. Let $c_1 = a_1$, $c_2 = a_2$, etc., until 
we obtain a horizontal step $c_\ell\ (= a_\ell)$ that ends strictly below the diagonal y = x. Then set 
$c_{\ell+1} = b_1$, $c_{\ell+2} = b_2$, etc., until we obtain a vertical step $c_{\ell+m}\ (= b_m)$ that ends on the line 
y = x. Then set $c_{\ell+m+1} = a_{\ell+1}$, $c_{\ell+m+2} = a_{\ell+2}$, etc., until we take a horizontal step that 
ends strictly below y = x. Then switch back to using the steps $b_{m+1}, \ldots$ until we return to 
y = x. Continue in this way until all steps are used. By convention, the special vertical step 
$b_0 = V$ is the last “b-step” to be consumed. 

For example, for the path $\pi$ in Figure 12.3, we have labeled the steps of $\pi$ as A through 
T for ease of reference. The special vertical step is step I. We begin by transferring steps 
J, K, L, M, N to the image path (starting at the origin). Step N goes below the diagonal, so 
we jump to the section of $\pi$ prior to the special vertical step and work southwest. After 
taking only one step (step H), we have returned to the diagonal. Now we jump back to 
our previous location in the top part of $\pi$ and take step O. This again takes us below the 
diagonal, so we jump back to the bottom part of $\pi$ and transfer steps G, F, E, D, C. Now we 
return to the top part and transfer steps P, Q, R, S, T. Finally, we return to the bottom part 
of $\pi$ and transfer steps B, A, and finally the special vertical step I. 

This construction has the following crucial property. Vertical steps above the line y = k 
in $\pi$ get transferred to vertical steps above the line y = x in $\phi_k(\pi)$, while vertical steps 
below the line y = k in $\pi$ get transferred to vertical steps below the line y = x in $\phi_k(\pi)$. 
Thus, $\phi_k(\pi)$ has exactly k flaws, and is therefore an element of $A_k$. 


FIGURE 12.3 
Mapping Dyck paths to flawed paths. 


Moreover, consider the coordinates of the special point $(x_i, y_i)$. By definition, $y_i = k = 
\mathrm{flaw}(\phi_k(\pi))$. On the other hand, we claim that $y_i - x_i$ equals the number of horizontal steps 
in $\phi_k(\pi)$ that start on y = x and end to the right of y = x. Each such horizontal step 
corresponds to a step after $(x_i, y_i)$ in $\pi$ that brings the path closer to the main diagonal 
y = x. For instance, these steps are N, O, and T in Figure 12.3. The definition of $\phi_k$ shows 
that the steps in question (in $\pi$) are the earliest east steps after $(x_i, y_i)$ that arrive on the 
lines y = x + d for $d = y_i - x_i - 1, \ldots, 2, 1, 0$. The number of such steps is therefore $y_i - x_i$, 
as claimed. 

The observations in the last paragraph allow us to compute the inverse map $\phi'_k : A_k \to 
A_0$. For, suppose $\pi \in A_k$ is a path with k flaws. We can recover $(x_i, y_i)$ since $y_i = k$ and 
$y_i - x_i$ is the number of east steps of $\pi$ departing from y = x. Next, we transfer the steps 
of $\pi$ to the top and bottom portions of $\phi'_k(\pi)$ by reversing the process described earlier. 
Figure 12.4 gives an example where n = 10 and k = 5. First we find the special point 
$(x_i, y_i) = (2, 5)$. We start by transferring the initial steps A, B, C of $\pi$ to the part of the 
image path starting at (2,5) and moving northeast. Since C goes below the diagonal in $\pi$, 
we now switch to the bottom part of the image path. The special vertical step must be 
skipped, so we work southwest from (2,4). We transfer steps D, E, F, G, H. Since H returns to 
y = x in $\pi$, we then switch back to the top part of the image path. We only get to transfer 
one step (step I) before returning to the bottom part of the image path. We transfer step J, 
then move back to the top part and transfer steps K through S. Finally, step T is transferred 
to become the special vertical step from (2,4) to (2,5). One checks that $\phi'_k$ is the two-sided 
inverse of $\phi_k$, so $\phi_k : A_0 \to A_k$ is a bijection. 

Now that we know $|A_k| = |A_0|$ for all k, the final statement of the theorem follows. For 
A is the disjoint union of the n + 1 sets $A_0, A_1, \ldots, A_n$, all of which have cardinality $|A_0|$. 
By the sum rule, 

\[ |A| = |A_0| + |A_1| + \cdots + |A_n| = (n+1)|A_0|, \]

and therefore 

\[ |A_k| = |A_0| = \frac{|A|}{n+1} = \frac{1}{n+1}\binom{2n}{n,\,n} = C_n \]

for $0 \le k \le n$. □ 


FIGURE 12.4 
Mapping flawed paths to Dyck paths. 


In probabilistic language, the Chung-Feller Theorem can be stated as follows. 


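12.5. Corollary. Suppose we pick a random lattice path $\pi$ from the origin to (n, n). The 
number of flaws in this path is uniformly distributed on {0, 1, 2, ..., n}. In other words, 

\[ P(\mathrm{flaw}(\pi) = k) = \frac{1}{n+1} \quad \text{for } k = 0, 1, \ldots, n. \]

Proof. We compute 

\[ P(\mathrm{flaw}(\pi) = k) = \frac{|A_k|}{|A|} = \frac{\frac{1}{n+1}\binom{2n}{n,\,n}}{\binom{2n}{n,\,n}} = \frac{1}{n+1}. \qquad \Box \]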
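
The uniform distribution asserted here is easy to confirm experimentally for small n. The sketch below (function names are ours) counts flaws exactly as in Definition 12.3, namely as the number of north steps that start at a point with y < x, and tabulates the flaw counts over all lattice paths from (0,0) to (n, n).

```python
from collections import Counter
from itertools import combinations
from math import comb


def flaws(word):
    """Number of north steps that begin strictly below the diagonal y = x."""
    x = y = count = 0
    for step in word:
        if step == "N":
            if y < x:
                count += 1
            y += 1
        else:
            x += 1
    return count


def flaw_distribution(n):
    """Map each flaw count k to the number of paths from (0,0) to (n,n) with k flaws."""
    dist = Counter()
    for north_positions in combinations(range(2 * n), n):
        chosen = set(north_positions)
        word = "".join("N" if i in chosen else "E" for i in range(2 * n))
        dist[flaws(word)] += 1
    return dist


if __name__ == "__main__":
    for n in range(1, 6):
        dist = flaw_distribution(n)
        print(n, [dist[k] for k in range(n + 1)], comb(2 * n, n) // (n + 1))
```

For each n, every entry of the printed list should equal the Catalan number in the last column. This checks only the counting statement, not the bijection $\phi_k$ itself.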


12.6. Remark. The Chung-Feller Theorem is significant in probability theory for the fol- 
lowing reason. One of the most celebrated theorems of probability is the central limit theo- 
rem. Roughly speaking, this theorem says that the sum of a large number of independent, 
identically distributed random variables (suitably normalized) will converge to a normal dis- 
tribution. The normal distribution is described by the “bell curve” that appears ubiquitously 
in probability and statistics. One often deals with situations involving random variables that 
are not identically distributed and are not independent of one another. One might hope 
that a generalization of the central limit theorem would still hold in such situations. 

Chung and Feller used the example of flawed lattice paths to show that such a 
generalization is not always possible. Fix n > 0, and let the sample space S consist of all lattice 
paths from the origin to (n, n). Given a lattice path $\pi \in S$, recall that 

\[ \mathrm{flaw}(\pi) = X_1(\pi) + X_2(\pi) + \cdots + X_n(\pi), \]

where $X_j(\pi) = \chi(\pi \text{ has a flaw in row } j)$. The random variables $X_1, X_2, \ldots, X_n$ are 
identically distributed; in fact, one can show that $P(X_j = 0) = 1/2 = P(X_j = 1)$ for 
all j (see 12.96). But we have seen that the sum of these random variables, namely 
$X_1 + X_2 + \cdots + X_n = \mathrm{flaw}$, is uniformly distributed on {0, 1, 2, ..., n} for every n. A 
uniform distribution is about as far as one can get from a normal distribution! The trouble 
is that the random variables $X_1, \ldots, X_n$ are not independent. 


12.3 Rook-Equivalence of Ferrers Boards 


This section continues the investigation of rook theory begun in §2.11. We define the notion 
of a rook polynomial for a Ferrers board and derive a characterization of when two Ferrers 
boards have the same rook polynomial. 


12.7. Definition: Ferrers Boards and Rook Polynomials. Let $\mu = (\mu_1 \ge \mu_2 \ge \cdots \ge 
\mu_s > 0)$ be an integer partition of n. The Ferrers board $F_\mu$ is a diagram consisting of s 
left-justified rows of squares with $\mu_i$ squares in row i. A non-attacking placement of k rooks 
on $F_\mu$ is a subset of k squares in $F_\mu$ such that no two squares lie in the same row or column. 
Let $r_k(\mu)$ be the number of non-attacking placements of k rooks on $F_\mu$. The rook polynomial 
of $\mu$ is 

\[ R_\mu(x) = \sum_{k \ge 0} r_k(\mu)\, x^k. \]


12.8. Example. If $\mu = (4,1,1,1)$, then $R_\mu(x) = 9x^2 + 7x + 1$. To see this, note that there 
is one empty subset of $F_\mu$ (which is a non-attacking placement of zero rooks). We can place 
one rook on any of the 7 squares in $F_\mu$, so the coefficient of $x^1$ in $R_\mu$ is 7. To place two 
non-attacking rooks, we place one rook in the first column but not in the first row (3 ways), 
and we place the second rook in the first row but not in the first column (3 ways). The 
product rule gives 9 as the coefficient of $x^2$ in $R_\mu$. It is impossible to place three or more 
non-attacking rooks on $F_\mu$, so all higher coefficients in $R_\mu$ are zero. 
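
The rook numbers in this example can be reproduced by exhaustive search. The sketch below is a brute-force illustration, practical only for small boards; the function name is hypothetical. It lists the squares of $F_\mu$, tests every k-subset for the non-attacking condition, and returns the coefficients $r_0(\mu), r_1(\mu), \ldots$ with trailing zeros removed.

```python
from itertools import combinations


def rook_numbers(mu):
    """Return [r_0(mu), r_1(mu), ...] for the Ferrers board with row lengths mu."""
    squares = [(i, j) for i, part in enumerate(mu) for j in range(part)]
    max_rooks = min(len(mu), max(mu, default=0))
    counts = []
    for k in range(max_rooks + 1):
        counts.append(sum(
            1
            for placement in combinations(squares, k)
            if len({i for i, _ in placement}) == k      # k distinct rows
            and len({j for _, j in placement}) == k     # k distinct columns
        ))
    while len(counts) > 1 and counts[-1] == 0:          # drop trailing zero coefficients
        counts.pop()
    return counts


if __name__ == "__main__":
    print(rook_numbers((4, 1, 1, 1)))  # coefficients of 1, x, x^2
    print(rook_numbers((2, 2)), rook_numbers((3, 1)), rook_numbers((2, 1, 1)))
```

The returned list is ordered by increasing power of x, so the rook polynomial $9x^2 + 7x + 1$ corresponds to the list `[1, 7, 9]`.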


As seen in the previous example, the constant term in any rook polynomial is 1, whereas 
the linear coefficient of a rook polynomial is the number $|\mu|$ of squares on the board $F_\mu$. 
Furthermore, $R_\mu(x)$ has degree at most $\min(\mu_1, \ell(\mu))$, since all rooks must be placed in 
distinct rows and columns of the board. 

It is possible for two different partitions to have the same rook polynomial. For example, 
one may check that 

\[ R_{(2,2)}(x) = 2x^2 + 4x + 1 = R_{(3,1)}(x) = R_{(2,1,1)}(x). \]

More generally, $R_\mu(x) = R_{\mu'}(x)$ for any partition $\mu$. 


12.9. Definition: Rook-Equivalence. We say that two integer partitions $\mu$ and $\nu$ are 
rook-equivalent iff they have the same rook polynomial, which means $r_k(\mu) = r_k(\nu)$ for all 
$k \ge 0$. 


A necessary condition for $\mu$ and $\nu$ to be rook-equivalent is that $|\mu| = |\nu|$. The next 
theorem gives an easily tested necessary and sufficient criterion for deciding whether two 
partitions are rook-equivalent. 


12.10. Theorem: Rook-Equivalence of Ferrers Boards. Suppose $\mu$ and $\nu$ are 
partitions of n. Write $\mu = (\mu_1 \ge \cdots \ge \mu_n)$ and $\nu = (\nu_1 \ge \cdots \ge \nu_n)$ by adding zero parts if 
necessary. The rook polynomials $R_\mu(x)$ and $R_\nu(x)$ are equal iff the multisets 

\[ [\mu_1 + 1, \mu_2 + 2, \ldots, \mu_n + n] \quad \text{and} \quad [\nu_1 + 1, \nu_2 + 2, \ldots, \nu_n + n] \]

are equal. 


Proof. The idea of the proof is to use the falling factorial basis $\{(x){\downarrow}_n : n \ge 0\}$ for the vector 
space of polynomials in x instead of the monomial basis $\{x^n : n \ge 0\}$ (see 2.76). For any 
partition $\lambda$, define 

\[ R^*_\lambda(x) = \sum_{k=0}^{n} r_{n-k}(\lambda)\,(x){\downarrow}_k = \sum_{k=0}^{n} r_{n-k}(\lambda)\, x(x-1)\cdots(x-k+1). \]

Note that $R_\mu(x) = R_\nu(x)$ iff $r_k(\mu) = r_k(\nu)$ for $0 \le k \le n$ (by linear independence of the 
monomial basis) iff $R^*_\mu(x) = R^*_\nu(x)$ in $\mathbb{R}[x]$ (by linear independence of the falling factorial 
basis). We will prove that this last condition holds iff the multisets mentioned in the theorem 
are equal. 

We now use rook combinatorics to derive a formula for $R^*_\mu(x)$. Fix a positive integer x. 
Consider the extended board $F_\mu(x)$, which has $\mu_i + x$ squares in row i, for $1 \le i \le n$. We 
obtain $F_\mu(x)$ from the board $F_\mu$ by adding x new squares on the left end of each of the n 
rows. Let us count the number of placements of n non-attacking rooks on $F_\mu(x)$. On one 
hand, we can build such a placement by working up the rows from bottom to top, placing 
a rook in a valid column of each successive row. By the product rule, the number of valid 
placements is 

\[ (x + \mu_n)(x + \mu_{n-1} - 1) \cdots (x + \mu_1 - (n-1)) = \prod_{i=1}^{n} \bigl(x + [\mu_i - (n-i)]\bigr). \]


On the other hand, let us count the number of placements of n non-attacking rooks on 
$F_\mu(x)$ that have exactly k rooks on the original board $F_\mu$. We can place these rooks first in 
$r_k(\mu)$ ways. The remaining n − k rooks must go in the remaining n − k unused rows in one 
of the leftmost x squares. Placing these rooks one at a time, we obtain $r_k(\mu)\, x(x-1)(x-2) 
\cdots (x-(n-k-1))$ valid placements. Adding over k gives the identity 

\[ \sum_{k=0}^{n} r_k(\mu)\,(x){\downarrow}_{n-k} = \prod_{i=1}^{n} \bigl(x + [\mu_i - (n-i)]\bigr). \]

Replacing k by n − k in the summation, we find that 

\[ R^*_\mu(x) = \prod_{i=1}^{n} \bigl(x + [\mu_i - (n-i)]\bigr). \]

This polynomial identity holds for infinitely many values of x (namely, for each positive 
integer x), so the identity must hold in the polynomial ring $\mathbb{R}[x]$. Similarly, 

\[ R^*_\nu(x) = \prod_{i=1}^{n} \bigl(x + [\nu_i - (n-i)]\bigr). \]

The proof is now completed by invoking the uniqueness of prime factorizations for 
one-variable polynomials with real coefficients. More precisely, note that we have exhibited 
factorizations of $R^*_\mu(x)$ and $R^*_\nu(x)$ into products of linear factors. These two monic 
polynomials are equal iff their linear factors (counting multiplicities) are the same, which holds iff 
the multisets 

\[ [\mu_i - (n-i) : 1 \le i \le n] \quad \text{and} \quad [\nu_i - (n-i) : 1 \le i \le n] \]

are the same. Adding n to everything, this is equivalent to the multiset equality in the 
theorem statement. □ 


12.11. Example. The partitions (2,2,0,0) and (3,1,0,0) are rook-equivalent, because 
[3,4,3,4] = [4,3,3,4]. The partitions (4,2,1) and (5,2) are not rook-equivalent, since 
[5,4,4,4,5,6,7] ≠ [6,4,3,4,5,6,7]. 
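
The criterion of Theorem 12.10 translates directly into a short test. In the sketch below (the names `rook_signature` and `rook_equivalent` are ours), each partition is padded with zero parts to length n, the multiset $[\mu_i + i]$ is stored as a sorted list, and two partitions of the same n are declared rook-equivalent when the sorted lists agree.

```python
def rook_signature(mu, n):
    """Sorted list representing the multiset [mu_1 + 1, mu_2 + 2, ..., mu_n + n]."""
    padded = list(mu) + [0] * (n - len(mu))   # add zero parts up to length n
    return sorted(part + i for i, part in enumerate(padded, start=1))


def rook_equivalent(mu, nu):
    """Test the multiset criterion of Theorem 12.10 for two partitions."""
    n = sum(mu)
    if sum(nu) != n:                          # different sizes: never equivalent
        return False
    return rook_signature(mu, n) == rook_signature(nu, n)


if __name__ == "__main__":
    print(rook_equivalent((2, 2), (3, 1)))
    print(rook_equivalent((4, 2, 1), (5, 2)))
    print(rook_signature((4, 2, 1), 7), rook_signature((5, 2), 7))
```

Sorting is used only to compare multisets; the test runs in time O(n log n), whereas computing the rook polynomials themselves by enumeration is far more expensive.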


12.4 Parking Functions 


This section illustrates the use of a probabilistic argument to enumerate a collection of 
combinatorial objects, namely the parking functions defined next. 


12.12. Definition: Parking Functions. A parking function of order n is a function 
$f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ such that 

\[ |\{x : f(x) \le i\}| \ge i \quad \text{for } 1 \le i \le n. \]


12.13. Example. For n = 8, the function f defined in Figure 12.5 is a parking function, 
but the function g is not. 


FIGURE 12.5 
A parking function and a non-parking function. 


The name “parking function” arises as follows. Consider a one-way street with n parking 
spaces numbered 1, 2, ..., n. Cars numbered 1, 2, ..., n arrive at the beginning of this street 
in numerical order. Each car wants to park in its own preferred spot on the street. We 
encode these parking preferences by a function $h : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$, by letting 
h(x) be the parking spot preferred by car x. Given h, the cars park in the following way. 
For x = 1, 2, ..., n, car x arrives and drives forward along the street to the spot h(x). If 
that spot is empty, car x parks there. Otherwise, the car continues to drive forward on the 
one-way street and parks in the first available spot after h(x), if any. The cars cannot return 
to the beginning of the street, so it is possible that not every car will be able to park. 

For example, suppose the parking preferences are given by the parking function f defined 
in Figure 12.5. Car 1 arrives first and parks in spot 2. The next two cars arrive and park in 
spots 6 and 3, respectively. When car 4 arrives, spots 2 and 3 are full, so car 4 parks in spot 
4. This process continues. At the end, every car has parked successfully, and the parking 
order is 8, 1, 3, 4, 6, 2, 5, 7. Now suppose the parking preferences are given by the non-parking 
function g defined in Figure 12.5. After the first six cars have arrived, the parking spots on 
the street are filled as follows: 

3,6,—,—,1, 2,4, 5. 


Car 7 arrives and drives to spot g(7) = 7. Since spots 7 and 8 are both full at this point, 
car 7 cannot park. 
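
The parking process just described is simple to simulate. In the sketch below, the function names are ours, and the two preference lists are hypothetical: Figure 12.5 is not reproduced here, so they were chosen only to be consistent with the outcomes described in the text (the final parking order 8, 1, 3, 4, 6, 2, 5, 7 for f, and the failure of car 7 for g). They are not claimed to be the functions of that figure.

```python
def park(preferences):
    """Simulate one-way-street parking.

    preferences[x - 1] is the spot preferred by car x. Returns (street, failed),
    where street[p - 1] is the car in spot p (or None) and failed lists the
    cars that could not park.
    """
    n = len(preferences)
    street = [None] * n
    failed = []
    for car, spot in enumerate(preferences, start=1):
        p = spot
        while p <= n and street[p - 1] is not None:  # drive forward past full spots
            p += 1
        if p <= n:
            street[p - 1] = car
        else:
            failed.append(car)                       # ran off the end of the street
    return street, failed


def is_parking_function(preferences):
    """Check the counting condition of Definition 12.12."""
    n = len(preferences)
    return all(sum(1 for v in preferences if v <= i) >= i for i in range(1, n + 1))


if __name__ == "__main__":
    f = [2, 6, 3, 2, 7, 5, 8, 1]   # hypothetical preferences, see lead-in
    g = [5, 6, 1, 7, 8, 2, 7, 3]   # hypothetical preferences, see lead-in
    print(park(f), is_parking_function(f))
    print(park(g), is_parking_function(g))
```

In every run, the list of failed cars is empty exactly when `is_parking_function` returns `True`, which is the content of the next theorem.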


12.14. Theorem. A function $h : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ is a parking function iff every 
car is able to park using the parking preferences determined by h. 


Proof. We prove the contrapositive in each direction. Suppose first that h is not a parking 
function. Then there exists $i \le n$ such that $|\{x : h(x) \le i\}| < i$. This means that fewer than 
i cars prefer to park in the first i spots. But then the first i spots cannot all be used, since 
a car never parks in a spot prior to the spot it prefers. Since there are n cars and n spots, 
the existence of an unused spot implies that not every car was able to park. 

Conversely, assume not every car can park. Let i be the earliest spot that is not taken 
after every car has attempted to park. Then no car preferred spot i. Suppose i or more 
cars preferred the first i − 1 spots. Not all of these cars can park in the first i − 1 spots. 
But then one of these cars would have parked in spot i, a contradiction. We conclude that 
$|\{x : h(x) \le i\}| < i$, so that h is not a parking function. □ 


FIGURE 12.6 
Parking on a roundabout. 


12.15. Theorem: Enumeration of Parking Functions. There are $(n+1)^{n-1}$ parking 
functions of order n. 


Proof. Fix n > 0. Define a circular parking function of order n to be any function $f : 
\{1, 2, \ldots, n\} \to \{1, 2, \ldots, n+1\}$. Let Z be the set of all such functions; we know that 
$|Z| = (n+1)^n$. We interpret circular parking functions as follows. Imagine a roundabout 
(circular street) with n + 1 parking spots numbered 1, 2, ..., n + 1. See Figure 12.6. As 
before, f encodes the parking preferences of n cars that wish to park on the roundabout. 
Thus, for $1 \le x \le n$ and $1 \le y \le n+1$, y = f(x) iff car x prefers to park in spot y. Cars 
1, 2, ..., n arrive at the roundabout in increasing order. Each car x enters just before spot 
1, then drives around to spot f(x) and parks there if possible. If spot f(x) is full, car x 
keeps driving around the roundabout and parks in the first empty spot that it encounters. 

No matter what f is, every car will succeed in parking in the circular situation. Moreover, 
since there are now n + 1 spots and only n cars, there will always be one empty spot at the 
end. Suppose we randomly select a circular parking function. Because of the symmetry of 
the roundabout, each of the n + 1 parking spaces is equally likely to be the empty one. (The 
fact that the entrance to the roundabout is at spot 1 is irrelevant here, since for parking 
purposes we may as well assume that car x enters the roundabout at its preferred spot 
f(x).) Thus, the probability that spot k is empty is $\frac{1}{n+1}$ for $1 \le k \le n+1$. On the other 
hand, spot n + 1 will be the empty spot iff f is a parking function of order n. For, if spot 
n + 1 is empty, then no car preferred spot n + 1, and no car passed spot n + 1 during the 
parking process. Thus, the circular parking process on the roundabout coincides with the 
original parking process on the one-way street (and conversely). Since spot n + 1 is empty 
with probability 1/(n + 1) and the sample space Z has size $(n+1)^n$, we conclude that the 
number of ordinary parking functions must be $|Z|/(n+1) = (n+1)^{n-1}$. □ 
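
Both halves of this argument can be checked by enumeration for small n. The following sketch (function names are ours) counts parking functions directly from Definition 12.12 and, separately, simulates the roundabout for every circular parking function to tabulate which spot ends up empty.

```python
from collections import Counter
from itertools import product


def is_parking_function(f):
    """Check the counting condition of Definition 12.12 for a tuple f of length n."""
    n = len(f)
    return all(sum(1 for v in f if v <= i) >= i for i in range(1, n + 1))


def count_parking_functions(n):
    """Count parking functions of order n by testing all n^n functions."""
    return sum(1 for f in product(range(1, n + 1), repeat=n) if is_parking_function(f))


def empty_spot(f, n):
    """Spot left empty after n cars park on a roundabout with n + 1 spots."""
    occupied = [False] * (n + 1)
    for pref in f:
        p = pref - 1                      # 0-based index of the preferred spot
        while occupied[p]:
            p = (p + 1) % (n + 1)         # keep driving around the circle
        occupied[p] = True
    return occupied.index(False) + 1


def empty_spot_counts(n):
    """Tabulate the empty spot over all (n + 1)^n circular parking functions."""
    return Counter(empty_spot(f, n) for f in product(range(1, n + 2), repeat=n))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_parking_functions(n), (n + 1) ** (n - 1))
    print(empty_spot_counts(3))
```

Each spot should appear as the empty one equally often, and the count for spot n + 1 should match the number of parking functions of order n.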


12.16. Remark. Let A_{n,k} be the set of circular parking functions of order n with empty
spot k. The preceding proof shows that |A_{n,k}| = (n + 1)^{n-1} for 1 ≤ k ≤ n + 1. We
established this counting result by a probabilistic argument, using symmetry to deduce
that P(A_{n,k}) = 1/(n + 1) for all k. This symmetry property is intuitively evident, but it
can also be proved rigorously as follows. Suppose f ∈ A_{n,k_1} and k_2 are given. Let φ(f) be
the function i ↦ f(i) + k_2 − k_1 mod (n + 1) for 1 ≤ i ≤ n, taking the remainder to lie in
{1, 2, ..., n + 1}. One may check that φ defines a bijection from A_{n,k_1} onto A_{n,k_2}. These
bijections prove that all the sets A_{n,k} (for 1 ≤ k ≤ n + 1) have the same cardinality.
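The check that the shift map carries one class onto another can be carried out mechanically for small n. This is a sketch under the assumption that functions are stored as tuples of preferences; the names `circular_park` and `shift` are hypothetical helpers, not notation from the text.

```python
from itertools import product

def circular_park(f, n):
    """Return the spot left empty when cars with preferences f park on the roundabout."""
    occupied = set()
    for preferred in f:
        spot = preferred
        while spot in occupied:
            spot = spot % (n + 1) + 1  # wrap from spot n+1 back to spot 1
        occupied.add(spot)
    (empty,) = set(range(1, n + 2)) - occupied
    return empty

def shift(f, k1, k2, n):
    """Apply i -> f(i) + k2 - k1 mod (n+1), with remainders taken in 1..n+1."""
    return tuple((v + k2 - k1 - 1) % (n + 1) + 1 for v in f)

n = 3
all_f = list(product(range(1, n + 2), repeat=n))
source = {f for f in all_f if circular_park(f, n) == 2}
target = {f for f in all_f if circular_park(f, n) == 4}
print({shift(f, 2, 4, n) for f in source} == target)  # True
```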


12.17. Remark. One of the early motivations for studying parking functions was their 
connection to hashing protocols. In computing applications, one often stores information in 
a data structure called a hash table. We consider a simplified model where n items are to 
be stored in a linear array of n cells. A hash function h : {1,2,...,n} → {1,2,...,n} is
used to determine where each item will be stored. We store item i in position h(i), unless
that position has already been taken by a previous item; this circumstance is called a
collision. We handle collisions via the following collision resolution policy: if h(i) is full, we
store item i in the earliest position after position h(i) that is not yet full (if any). If there is no
such position, the collision resolution fails (we do not allow “wraparound”). This scenario
is exactly like that of the cars parking on a one-way street according to the preferences 
encoded by h. Thus, we will be able to store all n items in the hash table iff h is a parking 
function. 
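The collision policy just described can be sketched as follows. This is an illustrative model only: `store_items` and `is_parking_function` are hypothetical names, the hash function is passed as a list of values, and the parking criterion used is the counting condition |{x : h(x) ≤ i}| ≥ i cited elsewhere in this chapter as 12.12.

```python
from itertools import product

def store_items(h):
    """Insert items 1..n using hash values h[0..n-1] in 1..n, probing forward only.

    Returns the filled table as a list (cell 1 first), or None if some item
    runs off the end of the array (no wraparound).
    """
    n = len(h)
    table = [None] * (n + 1)  # index 0 is unused so that cells are numbered 1..n
    for item, pos in enumerate(h, start=1):
        while pos <= n and table[pos] is not None:
            pos += 1  # collision: try the next cell
        if pos > n:
            return None
        table[pos] = item
    return table[1:]

def is_parking_function(h):
    """Counting criterion: at least i values are <= i, for every i in 1..n."""
    n = len(h)
    return all(sum(1 for v in h if v <= i) >= i for i in range(1, n + 1))

print(store_items([2, 1, 1]))  # [2, 1, 3]
print(store_items([3, 3, 1]))  # None
```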




12.5 Parking Functions and Trees 


We can use parking functions (§12.4) to give a bijective proof of Cayley’s formula 3.72 for 
the number of n-vertex trees. The proof involves labeled lattice paths, which we now define. 


12.18. Definition: Labeled Lattice Paths. A labeled lattice path consists of a lattice
path π from (0,0) to (a,b), together with a labeling of the b north steps of π with labels
1,2,...,b (each used exactly once) such that the labels for the north steps in a given column
increase from bottom to top.


We can illustrate a labeled lattice path by drawing π inside an (a + 1) × b grid of unit
squares and placing the label of each north step in the unit square to the right of that north
step. For example, Figure 12.7 displays a labeled lattice path ending at (5,7).


FIGURE 12.7
A labeled lattice path.


FIGURE 12.8
Converting a function to a labeled path.


12.19. Theorem: Enumeration of Labeled Paths. There are (a + 1)^b labeled lattice
paths from (0,0) to (a,b).


Proof. It suffices to construct a bijection between the set of labeled lattice paths ending at
(a,b) and the set of all functions f : {1,2,...,b} → {1,2,...,a + 1}. Given a labeled lattice
path P, define the associated function by setting f(i) = j for all labels i in column j of P.

The inverse map acts as follows. Given a function f : {1,2,...,b} → {1,2,...,a + 1},
let S_i = {x : f(x) = i} and s_i = |S_i| for 1 ≤ i ≤ a + 1. The labeled path associated to f
is the lattice path N^{s_1} E N^{s_2} E ⋯ E N^{s_{a+1}}, where the ith string of consecutive north steps is
labeled by the elements of S_i in increasing order. □


12.20. Example. The function associated to the labeled path P in Figure 12.7 is given by

1 ↦ 2, 2 ↦ 4, 3 ↦ 1, 4 ↦ 6, 5 ↦ 4, 6 ↦ 4, 7 ↦ 1.

Going the other way, the function f : {1,2,...,7} → {1,2,...,6} defined by

f(1) = 2, f(2) = 5, f(3) = 4, f(4) = 2, f(5) = 4, f(6) = 2, f(7) = 2

is mapped to the labeled lattice path shown in Figure 12.8.
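The two maps in the proof of 12.19 are easy to mechanize. The sketch below assumes a labeled path is stored as a list of steps, ("N", label) for a labeled north step and ("E", None) for an east step; this encoding and the function names are illustrative choices, not notation from the text. The sample function is the second one in Example 12.20 as printed above.

```python
from itertools import product

def function_to_path(f, a):
    """Build the path N^{s_1} E N^{s_2} E ... E N^{s_{a+1}} from f (a dict label -> column)."""
    steps = []
    for col in range(1, a + 2):
        for label in sorted(x for x in f if f[x] == col):
            steps.append(("N", label))  # labels in a column increase upward
        if col <= a:
            steps.append(("E", None))
    return steps

def path_to_function(steps):
    """Recover f by recording the column in which each labeled north step occurs."""
    f, col = {}, 1
    for kind, label in steps:
        if kind == "E":
            col += 1
        else:
            f[label] = col
    return f

f = {1: 2, 2: 5, 3: 4, 4: 2, 5: 4, 6: 2, 7: 2}
path = function_to_path(f, 5)
print("".join(kind for kind, _ in path))  # ENNNNEENNENE
print(path_to_function(path) == f)         # True
```

Enumerating all functions for small a and b and collecting the resulting paths gives (a + 1)^b distinct labeled paths, as the theorem predicts.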


A labeled Dyck path of order n is a Dyck path ending at (n,n) that is labeled according 
to the rules in 12.18. For example, Figure 12.9 displays the sixteen labeled Dyck paths of 
order 3. 


12.21. Theorem: Enumeration of Labeled Dyck Paths. There are (n + 1)^{n-1} labeled
Dyck paths of order n.


Proof. Using the bijection in 12.19, we can regard labeled lattice paths from (0,0) to (n,n)
as functions f : {1,2,...,n} → {1,2,...,n + 1}. We first show that non-Dyck labeled paths
correspond to non-parking functions under this bijection. A labeled path P is not a Dyck
path iff some east step of P goes from (i − 1, j) to (i, j) for some i > j. This condition holds
for P iff the function f associated to P satisfies |{x : f(x) ≤ i}| = j for some i > j. In turn,
this condition on f is equivalent to the existence of i such that |{x : f(x) ≤ i}| < i. But this
means that f is not a parking function (see 12.12). It now follows that labeled Dyck paths
are in bijective correspondence with parking functions. So the result follows from 12.15. □
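For small n, the correspondence can be confirmed by comparing the Dyck condition with a direct simulation of parking on a one-way street. This is a sketch with hypothetical helper names; a function is stored as a tuple of values in 1..n+1, and a preference of n + 1 is treated as a failure to park.

```python
from itertools import product

def is_dyck(f, n):
    """The i-th east step occurs at height |{x : f(x) <= i}|, which must be at least i."""
    return all(sum(1 for v in f if v <= i) >= i for i in range(1, n + 1))

def parks_on_street(f, n):
    """Simulate the one-way street with spots 1..n; True iff every car parks."""
    taken = set()
    for preferred in f:
        spot = preferred
        while spot in taken:
            spot += 1  # drive forward to the next spot
        if spot > n:
            return False  # the car ran off the end of the street
        taken.add(spot)
    return True

for n in range(1, 5):
    functions = list(product(range(1, n + 2), repeat=n))
    dyck = [f for f in functions if is_dyck(f, n)]
    print(n, len(dyck), (n + 1) ** (n - 1))  # the last two numbers agree
```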


12.22. Cayley’s Theorem via Parking Functions. There are (n + 1)^{n-1} trees with
vertex set {0,1,2,...,n}.


FIGURE 12.9
Labeled Dyck paths.


Proof. In light of the previous result, it suffices to define bijections between the set B of
labeled Dyck paths of order n and the set C of all trees with vertex set {0,1,2,...,n}.
To define f : B → C, let π be a labeled Dyck path of order n. Let (a_1, a_2, ..., a_n) be the
sequence of labels in the diagram of π, reading from the bottom row to the top row, and
set a_0 = 0. Define a graph T = f(π) as follows. For 0 ≤ j ≤ n, there is an edge in T from
vertex a_j to each vertex whose label appears in column j + 1 of the diagram of π. These are
all the edges of T. Using the fact that π is a labeled Dyck path, one proves by induction
on j that every a_j is either 0 or appears to the left of column j + 1 in the diagram, so that
every vertex in column j + 1 of π is reachable from vertex 0 in T. Thus, T = f(π) is a
connected graph with n edges and n + 1 vertices, so T is a tree by 3.71.


12.23. Example. Figure 12.10 shows a parking function f, the labeled Dyck path π
corresponding to f, and the tree T = f(π). We can use the figure to compute the edges of T
by writing a_j underneath column j + 1, for 0 ≤ j ≤ n. If we regard zero as the ancestor of
all other vertices, then the labels in column j + 1 are the children of vertex a_j.


Continuing the proof, we define the inverse map f′ : C → B. Let T ∈ C be a tree with
vertex set {0,1,2,...,n}. We generate the diagram for f′(T) by inserting labels into an
n × n grid from bottom to top. Denote these labels by (a_1, ..., a_n), and set a_0 = 0. The
labels a_1, a_2, ... in column 1 are the vertices of T adjacent to vertex a_0 = 0 (written in
increasing order from bottom to top). The labels in the second column are the neighbors of
a_1 other than vertex 0. The labels in the third column are the neighbors of a_2 not in the
set {a_0, a_1}. In general, the labels in column j + 1 are the neighbors of a_j not in the set
{a_0, a_1, ..., a_{j-1}}. Observe that we do not know the full sequence (a_1, ..., a_n) in advance,
but we reconstruct this sequence as we go along. We will show momentarily that a_j is always
known by the time we reach column j + 1.


FIGURE 12.10
Mapping parking functions to labeled Dyck paths to trees.


FIGURE 12.11
Mapping trees to labeled Dyck paths to parking functions.


12.24. Example. Figure 12.11 shows a tree T, the labeled Dyck path π = f′(T), and the
parking function associated to π.


Let us check that f′ is well defined. We break up the computation of f′(T) into stages,
where stage j consists of choosing the increasing sequence of labels a_i < ⋯ < a_k that occur
in column j. We claim that at each stage j with 1 ≤ j ≤ n + 1, a_{j-1} has already been
computed, so that the labels entered in column j occur in rows j or higher. This will show
that the algorithm for computing f′ is well defined and produces a labeled Dyck path. We
proceed by induction on j. The claim holds for j = 1, since a_0 = 0 by definition. Assume
that 1 < j ≤ n + 1 and that the claim holds for all j′ < j. To get a contradiction, assume
that a_{j-1} is not known when we reach column j. Since the claim holds for j − 1, we must
have already recovered the labels in the set W = {a_0 = 0, a_1, ..., a_{j-2}}, which are precisely
the labels that occur in the first j − 1 columns. Let z be a vertex of T not in W. Since T
is a tree, there is a path from 0 to z in T. Let y be the earliest vertex on this path not in
W, and let x be the vertex just before y on the path. By choice of y, we have x ∈ W, so
that x = 0 or x = a_k for some k ≤ j − 2. But if x = 0, then y occurs in column 1 and
hence y ∈ W. And if x = a_k, then the algorithm for f′ would have placed y in column
k + 1 ≤ j − 1, and again y ∈ W. These contradictions show that the claim holds for j. It is
now routine to check that f′ is the two-sided inverse of f. □
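Both maps in this proof can be sketched in a few lines. The encoding below is an assumption made for illustration: a labeled Dyck path is represented by its parking function g (a dict sending each label to its column), a tree by a set of (parent, child) pairs, and the function names are hypothetical.

```python
from itertools import product

def path_to_tree(g):
    """Map a labeled Dyck path (given by its parking function g) to a tree on {0,...,n}."""
    n = len(g)
    cols = {j: sorted(x for x in g if g[x] == j) for j in range(1, n + 1)}
    # a_0 = 0, followed by the labels read from the bottom row to the top row
    a = [0] + [x for j in range(1, n + 1) for x in cols[j]]
    return {(a[j], child) for j in range(n) for child in cols[j + 1]}

def tree_to_path(edges, n):
    """Inverse map: rebuild the column of every label from the tree."""
    adj = {v: set() for v in range(n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    a, g = [0], {}
    for j in range(n):
        # column j+1 holds the neighbors of a_j not among a_0, ..., a_{j-1}
        for x in sorted(adj[a[j]] - set(a[:j])):
            g[x] = j + 1
            a.append(x)
    return g

def parking_functions(n):
    """All parking functions of order n, via the criterion |{x : f(x) <= i}| >= i."""
    for vals in product(range(1, n + 1), repeat=n):
        if all(sum(v <= i for v in vals) >= i for i in range(1, n + 1)):
            yield dict(zip(range(1, n + 1), vals))

trees = {frozenset(path_to_tree(g)) for g in parking_functions(3)}
print(len(trees))  # 16 = 4**2 distinct trees on {0, 1, 2, 3}
```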


12.6 Möbius Inversion and Field Theory


This section gives two applications of the material in §4.7 to field theory. We show that 
every finite subgroup of the multiplicative group of any field must be cyclic; and we count 
the number of irreducible polynomials of a given degree with coefficients in a given finite 
field. The starting point for proving the first result is the relation n = ∑_{d|n} φ(d), proved
in 4.34. We begin by giving a combinatorial interpretation of this identity in terms of the 
orders of elements in a cyclic group of size n. 


12.25. Theorem: Order of Elements in a Cyclic Group. Suppose G is a cyclic group
of size d < ∞, written multiplicatively. If x ∈ G generates G and c ≥ 1, then x^c generates
a cyclic subgroup of G of order d/gcd(c,d) = lcm(c,d)/c.


Proof. Since ⟨x^c⟩ ⊆ G, the order of x^c must be finite. Let k be the order of x^c. We have
seen in 9.79 that k is the smallest positive integer such that x^{ck} = 1_G, and that the k
elements x^c, x^{2c}, ..., x^{kc} are distinct and constitute the cyclic subgroup of G generated
by x^c. Since x has order d, we know from 9.79 that x^m = 1 iff d|m. It follows from this
and the definition of k that kc is the least positive multiple of c that is also a multiple of
d. In other words, kc = lcm(c,d). It follows that the order of x^c is k = lcm(c,d)/c. Since
cd = lcm(c,d) gcd(c,d), we also have k = d/gcd(c,d). □
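As a quick numerical check of 12.25, one can model the cyclic group additively as Z/dZ with generator x = 1, so that x^c corresponds to c mod d. The function name `order_of_power` is an illustrative choice, and the additive model is an assumption of this sketch.

```python
from math import gcd

def order_of_power(c, d):
    """Order of x^c when x generates a cyclic group of size d (modelled as Z/dZ, x = 1)."""
    k, total = 1, c % d
    while total != 0:
        total = (total + c) % d  # pass from x^{kc} to x^{(k+1)c}
        k += 1
    return k

print(order_of_power(8, 12), 12 // gcd(8, 12))  # both equal 3
```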


12.26. Theorem: Counting Generators in a Cyclic Group. If G is a cyclic group of
size d < ∞, then there are exactly φ(d) elements in G that generate G.


Proof. Let x be a fixed generator of G. By 9.79, the d distinct elements of G are
x^1, x^2, ..., x^d = 1_G. By 12.25, the element x^c generates all of G iff gcd(c,d) = 1. By
the definition of φ (see 4.19), the number of such integers c between 1 and d is precisely
φ(d). □


12.27. Theorem: Subgroup Structure of Cyclic Groups. Let G be a cyclic group
of size n < ∞. For each d dividing n, there exists exactly one subgroup of G of size d, and
this subgroup is cyclic.


Proof. We only sketch the proof, which uses some results about group homomorphisms that
were stated as exercises in Chapter 9. We know from 9.59 that every subgroup of the cyclic
group Z has the form kZ for some unique k ≥ 0, and is therefore cyclic. Next, any finite
cyclic group G can be viewed as the quotient group Z/nZ for some n ≥ 1. This follows by
applying the fundamental homomorphism theorem 9.207 to the map from Z to G sending
1 to a generator of G. By the correspondence theorem 9.211, each subgroup H of G has the
form H = mZ/nZ for some subgroup mZ of Z containing nZ. Now, mZ contains nZ iff m|n,
and in this case |mZ/nZ| = n/m. It follows that there is a bijection between the positive
divisors of n and the subgroups of G. Each such subgroup is the homomorphic image of a
cyclic group mZ, so each subgroup of G is cyclic. □


Suppose G is cyclic of size n. For each d|n, let G_d be the unique (cyclic) subgroup of
G of size d. On one hand, each element y of G generates exactly one of the subgroups G_d
(namely, y generates the group G_d such that d is the order of y). On the other hand, we
have shown that G_d has exactly φ(d) generators. Invoking the sum rule, we obtain a new
proof of the fact that

\[ n = \sum_{d|n} \phi(d). \]
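The classification of elements by order can be made concrete in the additive group Z/nZ. In this sketch the order of y is computed with the formula n/gcd(y, n) from 12.25; the helper names are hypothetical.

```python
from math import gcd

def phi(d):
    """Euler's function: the number of c in 1..d with gcd(c, d) = 1."""
    return sum(1 for c in range(1, d + 1) if gcd(c, d) == 1)

def elements_by_order(n):
    """Group the elements of Z/nZ according to their (additive) order."""
    groups = {}
    for y in range(n):
        order = n // gcd(y, n)  # gcd(0, n) = n, so the identity has order 1
        groups.setdefault(order, []).append(y)
    return groups

groups = elements_by_order(12)
print({d: len(ys) for d, ys in sorted(groups.items())})
# {1: 1, 2: 1, 3: 2, 4: 2, 6: 2, 12: 4}, and the class sizes add up to 12
```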


12.28. Theorem: Detecting Cyclic Groups. If G is a group of size n such that for 
each d dividing n, G has at most one subgroup of size d, then G is cyclic. 


Proof. For each d dividing n, let T_d be the set of elements in G of order d. G is the disjoint
union of the sets T_d by 9.119. Consider a fixed choice of d such that T_d is nonempty. Then G
has an element of order d, hence has a cyclic subgroup of size d. By assumption, this is the
only subgroup of G of size d, and we know this subgroup has φ(d) generators. Therefore,
|T_d| = φ(d) whenever |T_d| ≠ 0. We conclude that

\[ n = |G| = \sum_{d|n} |T_d| \le \sum_{d|n} \phi(d) = n. \]

Since the extreme ends of this calculation both equal n, the middle inequality here must
in fact be an equality. This is only possible if every T_d is nonempty. In particular, T_n is
nonempty. Therefore, G is cyclic, since it is generated by each of the elements in T_n. □


12.29. Theorem: Multiplicative Subgroups of Fields. Let F be any field, possibly
infinite. If G is a finite subgroup of the multiplicative group of F, then G is cyclic.


Proof. Suppose G is a subgroup of F^* (the multiplicative group of nonzero elements in F)
such that |G| = n < ∞. By 12.28, it suffices to show that G has at most one subgroup of
size d, for each d|n. If not, let H and K be two distinct subgroups of G of size d. Then
H ∪ K is a set with at least d + 1 elements; and for each z ∈ H ∪ K, it follows from 9.119
that z is a root of the polynomial x^d − 1 in F. But any polynomial of degree d over F has at
most d distinct roots in the field F (see 2.157). This contradiction completes the proof. □
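For the field Z/pZ with p prime, the theorem says that the group of nonzero residues is cyclic, so it has exactly φ(p − 1) generators by 12.26. The following sketch (with hypothetical helper names) finds them by brute force.

```python
from math import gcd

def phi(d):
    """Euler's function: the number of c in 1..d with gcd(c, d) = 1."""
    return sum(1 for c in range(1, d + 1) if gcd(c, d) == 1)

def multiplicative_order(z, p):
    """Order of z in the multiplicative group of Z/pZ (p prime, z not divisible by p)."""
    k, power = 1, z % p
    while power != 1:
        power = (power * z) % p
        k += 1
    return k

def generators(p):
    """All elements that generate the group of nonzero residues modulo the prime p."""
    return [z for z in range(1, p) if multiplicative_order(z, p) == p - 1]

print(generators(7))                 # [3, 5]
print(len(generators(13)), phi(12))  # both equal 4
```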


Our next goal is to count irreducible polynomials of a given degree over a finite field. We
shall assume a number of results from field theory, whose proofs may be found in Chapter
V of the algebra text by Hungerford [70]. Let F be a finite field with q elements. It is
known that q must be a prime power, say q = p^e, and F is uniquely determined (up to
isomorphism) by its cardinality q. Every finite field F with q = p^e elements is a splitting
field for the polynomial x^q − x over Z/pZ.


12.30. Theorem: Enumeration of Irreducible Polynomials. Let F be a field with
q = p^e elements. For each n ≥ 1, let I(n,q) be the number of monic irreducible polynomials
of degree n in the polynomial ring F[x]. Then

\[ q^n = \sum_{d|n} d\,I(d,q) \]

and hence

\[ I(n,q) = \frac{1}{n} \sum_{d|n} q^d \mu(n/d). \]


Proof. The strategy of the proof is to classify the elements in a finite field K of size q^n based
on their minimal polynomials. From field theory, we know that each element u ∈ K is the
root of a uniquely determined monic, irreducible polynomial in F[x] (called the minimal
polynomial of u over F). The degree d of this minimal polynomial is d = [F(u) : F], where
for any field extension E ⊆ H, [H : E] denotes the dimension of H viewed as a vector space
over E. It is known that n = [K : F] = [K : F(u)] · [F(u) : F], so that d|n. Conversely,
given any divisor d of n, we claim that every irreducible polynomial of degree d in F[x] has
d distinct roots in K. Sketch of proof: Suppose g is such a polynomial and z ≠ 0 is a root
of g in a splitting field of g over K. Since z lies in F(z), which is a field with q^d elements, it
follows from 9.119 (applied to the multiplicative group F(z)^*) that z^{q^d − 1} = 1. One checks
that q^d − 1 divides q^n − 1 (since d|n), so that z^{q^n − 1} = 1, and hence z is a root of x^{q^n} − x.
It follows that every root z of g actually lies in K (which is a splitting field for x^{q^n} − x).
Furthermore, since z is a root of x^{q^n} − x, it follows that the minimal polynomial for z over
F (namely g) divides x^{q^n} − x in F[x]. We conclude that g divides x^{q^n} − x in K[x] also. The
polynomial x^{q^n} − x is known to split into a product of q^n distinct linear factors over K;
in fact, x^{q^n} − x = ∏_{u∈K} (x − u). By unique factorization in the polynomial ring K[x], g
must also be a product of d distinct linear factors. This completes the proof of the claim.

We can now write K as the disjoint union of sets R_g indexed by all irreducible polynomials
in F[x] whose degrees divide n, where R_g consists of the deg(g) distinct roots of g in
K. Invoking the sum rule and grouping together terms indexed by polynomials of degree d
dividing n, we obtain

\[ q^n = |K| = \sum_{\substack{\text{irreducible } g \\ \deg(g)|n}} |R_g|
       = \sum_{\substack{\text{irreducible } g \\ \deg(g)|n}} \deg(g) = \sum_{d|n} d\,I(d,q). \]

We can now apply the Möbius inversion formula 4.30 to the functions f(n) = q^n and
g(n) = nI(n,q) to obtain

\[ n\,I(n,q) = \sum_{d|n} q^d \mu(n/d). \]  □
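The formula for I(n,q) can be compared with a direct count. The sketch below is an illustration with hypothetical helper names, and its brute-force part only handles prime fields Z/pZ (so q = p), where arithmetic is ordinary arithmetic modulo p. A monic polynomial of degree n is reducible iff it is a product of two monic polynomials of degrees d and n − d with 1 ≤ d ≤ n/2.

```python
from itertools import product

def mobius(m):
    """Moebius function of a positive integer m, by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0  # m has a squared prime factor
            result = -result
        p += 1
    return -result if m > 1 else result

def count_irreducible(n, q):
    """I(n, q) = (1/n) * sum over d | n of q^d * mu(n/d)."""
    return sum(q ** d * mobius(n // d) for d in range(1, n + 1) if n % d == 0) // n

def poly_mul(f, g, p):
    """Multiply coefficient tuples (lowest degree first) modulo the prime p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return tuple(out)

def monic_polys(deg, p):
    """All monic polynomials of the given degree over Z/pZ."""
    return [coeffs + (1,) for coeffs in product(range(p), repeat=deg)]

def brute_force_count(n, p):
    """Count monic irreducibles of degree n over Z/pZ by removing all products."""
    reducible = set()
    for d in range(1, n // 2 + 1):
        for f in monic_polys(d, p):
            for g in monic_polys(n - d, p):
                reducible.add(poly_mul(f, g, p))
    return p ** n - len(reducible)

print([count_irreducible(n, 2) for n in range(1, 7)])  # [2, 1, 2, 3, 6, 9]
print([brute_force_count(n, 2) for n in range(1, 7)])  # the same list
```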


12.7 Quantum Binomial Coefficients and Subspaces


Recall (§6.7) that the quantum binomial coefficients are the polynomials in N[x] defined by
the formula

\[ \left[{n \atop k}\right]_x = \frac{[n]!_x}{[k]!_x\,[n-k]!_x}
   = \frac{\prod_{i=1}^{n}(x^i-1)}{\prod_{i=1}^{k}(x^i-1)\,\prod_{i=1}^{n-k}(x^i-1)}. \]

We gave a number of combinatorial interpretations of these polynomials in §6.7. In this
section, we discuss a linear-algebraic interpretation of the integers $\left[{n \atop k}\right]_q$, where q is a prime
power. To read this section, the reader should have some previous experience with fields
and vector spaces. We begin by using bases to determine the possible sizes of vector spaces
over finite fields.


12.31. Theorem: Size of Vector Spaces over Finite Fields. Suppose V is a
d-dimensional vector space over a finite field F with q elements. Then |V| = q^d.


Proof. Let (v_1, ..., v_d) be an ordered basis for V. By definition of a basis, for each v ∈ V,
there exists exactly one d-tuple of scalars (c_1, ..., c_d) ∈ F^d such that v = c_1 v_1 + c_2 v_2 + ⋯ +
c_d v_d. In other words, there is a bijection v ↦ (c_1, ..., c_d) from V to F^d. Because |F| = q,
the product rule gives |V| = |F^d| = |F|^d = q^d. □


12.32. Theorem: Size of Finite Fields. If K is a finite field, then |K| = p^e for some
prime p and some e ≥ 1.


Proof. Given K, let F be the cyclic subgroup of the additive group (K,+) generated by
1_K. The size of F is some finite number p (since K is finite), and p > 1 since 1_K ≠ 0_K.
We know that p is the smallest positive integer such that p·1_K = 0_K. One checks (using the
distributive laws) that F is a subring of K, not just a subgroup. If p were not prime, say
p = ab with 1 < a, b < p, then (a·1_K)·(b·1_K) = ab·1_K = p·1_K = 0_K, and yet a·1_K, b·1_K ≠ 0_K.
This contradicts the fact that fields have no zero divisors. Thus, p must be prime. It now
follows that F is a field isomorphic to the field of integers modulo p. K can be regarded as
a vector space over its subfield F, by defining scalar multiplication F × K → K to be the
restriction of the multiplication K × K → K in the field. Since K is finite, it must be a
finite-dimensional vector space over F. Thus the desired result follows from 12.31. □


12.33. Remark. One can show that, for every prime power p^e, there exists a finite field of
size p^e, which is unique up to isomorphism. The existence proof is sketched in 12.126.


We now give the promised linear-algebraic interpretation of quantum binomial coeffi- 
cients. 


12.34. Theorem. Let K be a finite field with q elements. For all integers n ≥ 0 and
0 ≤ k ≤ n, $\left[{n \atop k}\right]_q$ is the number of k-dimensional subspaces of any n-dimensional vector
space V over K.


Proof. Let f(n,k,q) be the number of k-dimensional subspaces of V. (One can check that
this number depends only on k, q, and n = dim(V).) Recall from 12.31 that |V| = q^n and
each d-dimensional subspace of V has size q^d. By rearranging factors in the defining formula
for $\left[{n \atop k}\right]_q$, we see that $\left[{n \atop k}\right]_q$ = f(n,k,q) holds iff

\[ f(n,k,q)\,(q^k-1)(q^{k-1}-1)\cdots(q^1-1) = (q^n-1)(q^{n-1}-1)\cdots(q^{n-k+1}-1). \]


We establish this equality by the following counting argument. Let S be the set of all
ordered lists (v_1, ..., v_k) of k linearly independent vectors in V. Here is one way to build
such a list. First, choose a nonzero vector v_1 ∈ V in any of q^n − 1 ways. This vector spans a
one-dimensional subspace W_1 of V of size q = q^1. Second, choose a vector v_2 ∈ V ∖ W_1, in
any of q^n − q ways. The list (v_1, v_2) must be linearly independent since v_2 is not in the space
W_1 spanned by v_1. Vectors v_1 and v_2 span a two-dimensional subspace W_2 of V of size q^2.
Third, choose v_3 ∈ V ∖ W_2 in q^n − q^2 ways. Continue similarly. When choosing v_i, we have
already found i − 1 linearly independent vectors v_1, ..., v_{i-1} that span a subspace W_{i-1} of V of
size q^{i-1}. Consequently, (v_1, ..., v_i) will be linearly independent iff we choose v_i ∈ V ∖ W_{i-1},
which is a set of size q^n − q^{i-1}. By the product rule, we conclude that

\[ |S| = \prod_{i=1}^{k} (q^n - q^{i-1}) = \prod_{i=1}^{k} q^{i-1}(q^{n-i+1} - 1)
       = q^{k(k-1)/2}(q^n-1)(q^{n-1}-1)\cdots(q^{n-k+1}-1). \]

Now let us count S in a different way. Observe that the vectors in each list (v_1, ..., v_k) ∈ S
span some k-dimensional subspace of V. So we can begin by choosing such a subspace W
in any of f(n,k,q) ways. Next we choose v_1, ..., v_k ∈ W one at a time, following the same
process used in the first part of the proof. We can choose v_1 in |W| − 1 = q^k − 1 ways, then
v_2 in q^k − q ways, and so on. By the product rule,

\[ |S| = f(n,k,q) \prod_{i=1}^{k} (q^k - q^{i-1})
       = f(n,k,q)\, q^{k(k-1)/2}(q^k-1)(q^{k-1}-1)\cdots(q^1-1). \]

Equating the two formulas for |S| and cancelling q^{k(k-1)/2} gives the desired result. □
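For q = 2 the theorem can be tested by listing subspaces explicitly. In this sketch (not from the text) vectors of K^n over the two-element field are encoded as n-bit integers with addition given by XOR; `gaussian_binomial`, `span_gf2`, and `count_subspaces_gf2` are hypothetical names.

```python
from itertools import combinations

def gaussian_binomial(n, k, q):
    """Evaluate (q^n - 1)...(q^{n-k+1} - 1) / ((q^k - 1)...(q - 1))."""
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

def span_gf2(vectors):
    """Span of bitmask vectors over the field with two elements."""
    space = {0}
    for v in vectors:
        space |= {u ^ v for u in space}
    return frozenset(space)

def count_subspaces_gf2(n, k):
    """Count k-dimensional subspaces of (Z/2Z)^n by collecting spans of k nonzero vectors."""
    spaces = {span_gf2(c) for c in combinations(range(1, 2 ** n), k)}
    return sum(1 for s in spaces if len(s) == 2 ** k)  # size 2^k means dimension k

print(gaussian_binomial(4, 2, 2), count_subspaces_gf2(4, 2))  # 35 35
```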


In 6.36, we saw that

\[ \left[{n \atop k}\right]_q = \sum_{\mu \in P(k,n-k)} q^{|\mu|}, \]

where P(k, n − k) is the set of all integer partitions μ that fit in a k × (n − k) rectangle. In
the rest of this section, we shall give a second proof of 12.34 by showing that

\[ f(n,k,q) = \sum_{\mu \in P(k,n-k)} q^{|\mu|}. \]

This proof is longer than the one just given, but it reveals a close connection between enu- 
meration of subspaces on one hand, and enumeration of partitions in a box (or, equivalently, 
lattice paths) on the other hand. 

For convenience, we shall work with the vector space V = K^n whose elements are
n-tuples of elements of K. We regard elements of V as row vectors of length n. The key
linear-algebraic fact we need is that every k-dimensional subspace of V = K^n has a unique
“reduced row-echelon form basis.”


12.35. Definition: Reduced Row-Echelon Form. Let A be a k × n matrix with entries
in K. Let A_1, ..., A_k ∈ K^n be the k rows of A. We say A is a reduced row-echelon form
(RREF) matrix iff the following conditions hold: (i) A_i ≠ 0 for all i, and the leftmost
nonzero entry of A_i is 1_K (call these entries leading ones); (ii) if the leading one of A_i
occurs in column j(i), then j(1) < j(2) < ⋯ < j(k); (iii) every leading one is the only
nonzero entry in its column. An ordered basis B = (v_1, ..., v_k) for a k-dimensional subspace
of K^n is called a RREF basis iff the matrix whose rows are v_1, ..., v_k is a RREF matrix.


12.36. Theorem: RREF Bases. Let K be any field. Every k-dimensional subspace of
K^n has a unique RREF basis. Conversely, the rows of every k × n RREF matrix comprise
an ordered basis for a k-dimensional subspace of K^n. Consequently, there is a bijection
between the set of such subspaces and the set of k × n RREF matrices with entries in K.


Proof. We sketch the proof, trusting the reader’s ability to supply the remaining linear 
algebra details. Step 1: We use row-reduction to show that any given k-dimensional subspace 
W of K^n has at least one RREF basis. Start with any ordered basis v_1, ..., v_k of W, and let
A be the matrix with rows v_1, ..., v_k. There are three “elementary row operations” we can
use to simplify A: interchange two rows; multiply one row by a nonzero scalar; add any scalar 
multiple of one row to a different row. A routine verification shows that performing any one 
of these operations has no effect on the subspace spanned by the rows of A. Therefore, we 
can create new ordered bases for W by performing sequences of row operations on A. Using 
the well-known Gaussian elimination algorithm (“row reduction” ), we can bring the matrix 
A into reduced row-echelon form. The rows of the new matrix give the desired RREF basis 
of W. 

Step 2: We show that a given subspace W has at most one RREF basis. Use induction on
k, the base case k = 0 being immediate. For the induction step, assume n ≥ 1 and k ≥ 1 are
fixed, and the uniqueness result is known for smaller values of k. Let A and B be two RREF
matrices whose rows form bases of W; we must prove A = B. Let j(1) < j(2) < ⋯ < j(k)
be the positions of the leading ones in A, and let r(1) < ⋯ < r(k) be the positions of the
leading ones in B. If j(1) < r(1), then the first row of A (which is a vector in W) has a 1 in
position j(1). This vector cannot possibly be a linear combination of the rows of B, all of
whose nonzero entries occur in columns after j(1). Thus, j(1) < r(1) is impossible. A similar
argument rules out r(1) < j(1), so we must have j(1) = r(1). Let W′ be the subspace of
W consisting of vectors with zeroes in positions 1, 2, ..., j(1). Consideration of leading ones
shows that rows 2 through k of A must form a basis for W′, and rows 2 through k of B
also form a basis for W′. Since dim(W′) = k − 1, the induction hypothesis implies that
rows 2 through k of A equal the corresponding rows of B. In particular, we now know that
r(i) = j(i) for 1 ≤ i ≤ k. To finish, we must still check that row 1 of A equals row 1 of B.
Let the rows of B be w_1, ..., w_k, and write v_1 for the first row of A. Since v_1 ∈ W, we have
v_1 = a_1 w_1 + ⋯ + a_k w_k for suitable scalars a_i. Consideration of column j(1) shows that
a_1 = 1. On the other hand, if a_i ≠ 0 for some i > 1, then a_1 w_1 + ⋯ + a_k w_k would have a
nonzero entry in position j(i), whereas v_1 has a zero entry in this position (since the leading
ones occur in the same columns in A and B). This is a contradiction, so a_2 = ⋯ = a_k = 0.
Thus v_1 = w_1, as desired, and we have now proved that A = B.

Step 3: We show that the k rows v_1, ..., v_k of a given RREF matrix form an ordered
basis for some k-dimensional subspace of K^n. It suffices to show that the rows in question
are linearly independent vectors. Suppose c_1 v_1 + ⋯ + c_k v_k = 0, where c_i ∈ K. Recall that
the leading one in position (i, j(i)) is the only nonzero entry in its column. Therefore, taking
the j(i)th component of the preceding equation, we get c_i = 0 for 1 ≤ i ≤ k. □


Because of the preceding theorem, the problem of counting k-dimensional subspaces of
K^n (where |K| = q) reduces to the problem of counting k × n RREF matrices with entries
in K. Our second proof of 12.34 will therefore be complete once we prove the following
result.


12.37. Theorem: Enumeration of RREF Matrices. Let K be a finite field with q
elements. The number of k × n RREF matrices with entries in K is

\[ \sum_{\mu \in P(k,n-k)} q^{|\mu|} = \left[{n \atop k}\right]_q. \]


Proof. Let us classify the k × n RREF matrices based on the columns j(1) < j(2) < ⋯ <
j(k) where the leading ones occur. To build a RREF matrix with the leading ones in these
positions, we must put zeroes in all matrix positions (i,p) such that p < j(i); we must
also put zeroes in all matrix positions (r, j(i)) such that r < i. However, in all the other
positions to the right of the leading ones, there is no restriction on the elements that occur
except that they must come from the field K of size q. How many such “free positions”
are there? The first row contains n − j(1) entries after the leading one, but k − 1 of these
entries are in columns above other leading ones. So there are μ_1 = n − j(1) − (k − 1) free
positions in this row. The next row contains n − j(2) entries after the leading one, but k − 2
of these occur in columns above other leading ones. So there are μ_2 = n − j(2) − (k − 2)
free positions in row 2. Similarly, there are μ_i = n − j(i) − (k − i) = n − k + i − j(i) free
positions in row i for 1 ≤ i ≤ k. The condition 1 ≤ j(1) < j(2) < ⋯ < j(k) ≤ n is logically
equivalent to 0 ≤ j(1) − 1 ≤ j(2) − 2 ≤ ⋯ ≤ j(k) − k ≤ n − k, which is in turn equivalent
to n − k ≥ μ_1 ≥ μ_2 ≥ ⋯ ≥ μ_k ≥ 0. Thus there is a bijection between the set of valid
positions j(1) < j(2) < ⋯ < j(k) for the leading ones, and the set of integer partitions
μ = (μ_1, ..., μ_k) that fit in a k × (n − k) box, given by μ_i = n − k + i − j(i). Furthermore,
|μ| = μ_1 + ⋯ + μ_k is the total number of free positions in each RREF matrix with leading
ones in the positions j(i). Using the product rule to fill these free positions one at a time
with elements of K, we see that there are q^{|μ|} RREF matrices with leading ones in the given
positions. The theorem now follows from the sum rule, keeping in mind the bijection just
constructed between j-sequences and partitions. □
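The bijection between pivot columns and partitions, and the resulting count, can be sketched directly from the formula μ_i = n − k + i − j(i). The function names below are illustrative, not from the text; the sample values are those used in Example 12.38.

```python
from itertools import combinations

def pivots_to_partition(pivots, n):
    """Map leading-one columns j(1) < ... < j(k) to mu_i = n - k + i - j(i)."""
    k = len(pivots)
    return tuple(n - k + i - j for i, j in enumerate(pivots, start=1))

def partition_to_pivots(mu, n):
    """Inverse map: j(i) = n - k + i - mu_i."""
    k = len(mu)
    return tuple(n - k + i - m for i, m in enumerate(mu, start=1))

def count_rref(n, k, q):
    """Sum q^{|mu|} over all choices of pivot columns (sum rule plus product rule)."""
    return sum(q ** sum(pivots_to_partition(p, n))
               for p in combinations(range(1, n + 1), k))

print(pivots_to_partition((2, 5, 6, 9), 10))  # (5, 3, 3, 1)
print(partition_to_pivots((6, 2, 2, 0), 10))  # (1, 6, 7, 10)
print(count_rref(4, 2, 2))                    # 35
```

The value 35 agrees with the product formula (2^4 − 1)(2^3 − 1)/((2^2 − 1)(2 − 1)).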


12.38. Example. To illustrate the preceding proof, take n = 10, k = 4, and consider
RREF matrices of the form

\[ \begin{bmatrix}
0 & 1 & * & * & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & *
\end{bmatrix} \]

Here *’s mark the free positions in the matrix, and (j(1), j(2), j(3), j(4)) = (2,5,6,9).
The associated partition is μ = (5,3,3,1), which does fit in a 4 × 6 box. We can see the
(reflected) diagram of this partition in the matrix by erasing the columns without stars and
right-justifying the remaining columns. Evidently there are q^{12} = q^{|μ|} ways of completing
this template to get an RREF matrix with the leading ones in the indicated positions.

Going the other way, consider another partition μ = (6,2,2,0) that fits in a 4 × 6 box.
Using the formula j(i) = n − k + i − μ_i, we recover (j(1), j(2), j(3), j(4)) = (1,6,7,10),
which tells us the locations of the leading ones. So this particular partition corresponds to
RREF matrices that match the following template:

\[ \begin{bmatrix}
1 & * & * & * & * & 0 & 0 & * & * & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix} \]


12.8 Tangent and Secant Numbers


In calculus, one learns the following power series expansions for the trigonometric functions
sine, cosine, and arctangent:

\[ \sin x = x - x^3/3! + x^5/5! - x^7/7! + \cdots = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!}; \]
\[ \cos x = 1 - x^2/2! + x^4/4! - x^6/6! + \cdots = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!}; \]
\[ \arctan x = x - x^3/3 + x^5/5 - x^7/7 + \cdots = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k+1}. \]

These expansions are all special cases of Taylor’s formula f(x) = ∑_{n=0}^{∞} (f^{(n)}(0)/n!) x^n. Using
Taylor’s formula, one can also find power series expansions for the tangent and secant
functions:

\[ \tan x = x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + \frac{17}{315}x^7 + \frac{62}{2835}x^9 + \frac{1382}{155925}x^{11} + \cdots; \]
\[ \sec x = 1 + \frac{1}{2}x^2 + \frac{5}{24}x^4 + \frac{61}{720}x^6 + \frac{277}{8064}x^8 + \frac{50521}{3628800}x^{10} + \cdots. \]

The coefficients of these series seem quite irregular and unpredictable compared to the 
preceding three series. Remarkably, as we shall see in this section, these coefficients encode 
the solution to a counting problem involving permutations. 

As in Chapter 7, we consider formal versions of the tangent and secant power series
to avoid any questions of convergence. We define tan x = sin x / cos x and sec x = 1 / cos x,
where sin x and cos x are the formal series defined in 7.52. Now, for each n ≥ 0, set a_n =
(tan x)^{(n)}(0) and b_n = (sec x)^{(n)}(0). The formal Maclaurin formula 7.55 asserts that

\[ \tan x = \sum_{n=0}^{\infty} \frac{a_n}{n!} x^n, \qquad \sec x = \sum_{n=0}^{\infty} \frac{b_n}{n!} x^n. \tag{12.1} \]

Since the ordinary Maclaurin series for the tangent and secant functions converge in a
neighborhood of zero, the coefficients in the formal power series above match the coefficients
in the ordinary power series representing the tangent and secant functions. The first several
values of a_n and b_n are

\[ (a_n : n \ge 0) = (0, 1, 0, 2, 0, 16, 0, 272, 0, 7936, 0, 353792, \ldots); \tag{12.2} \]
\[ (b_n : n \ge 0) = (1, 0, 1, 0, 5, 0, 61, 0, 1385, 0, 50521, \ldots). \]


One can check that a_n = 0 for all even n and b_n = 0 for all odd n (cf. 7.161).
Next, for each integer n ≥ 0, let c_n be the number of permutations w = w_1 w_2 ⋯ w_n of
{1,2,...,n} such that

\[ w_1 < w_2 > w_3 < w_4 > \cdots < w_{n-1} > w_n; \tag{12.3} \]

note that c_n = 0 for all even n. For each integer n ≥ 0, let d_n be the number of permutations
w of {1,2,...,n} (or any n-letter ordered alphabet) such that

\[ w_1 < w_2 > w_3 < w_4 > \cdots > w_{n-1} < w_n; \tag{12.4} \]

note that d_n = 0 for all odd n. By reversing the ordering of the letters, one sees that d_n
also counts the permutations w of n letters such that

\[ w_1 > w_2 < w_3 > w_4 < \cdots < w_{n-1} > w_n. \tag{12.5} \]

Permutations of the form (12.3) or (12.4) are called up-down permutations. We aim to prove
that a_n = c_n and b_n = d_n for all integers n ≥ 0. The proof consists of five steps.

Step 1. Using the formal derivative rules, one may show that $(\tan x)' = \sec^2 x$ (see 7.159).
Differentiating the first series in (12.1), squaring the second series using 7.6, and equating
the coefficients of $x^n$, we obtain

$$\frac{a_{n+1}}{n!} = \sum_{k=0}^{n} \frac{b_k}{k!} \cdot \frac{b_{n-k}}{(n-k)!}$$

or equivalently,

$$a_{n+1} = \sum_{k=0}^{n} \binom{n}{k} b_k b_{n-k}. \tag{12.6}$$


Step 2. We also have $(\sec x)' = \tan x \sec x$ (see 7.159). Differentiating the second series
in (12.1), multiplying the two series together using 7.6, and equating the coefficients of $x^n$,
we obtain

$$\frac{b_{n+1}}{n!} = \sum_{k=0}^{n} \frac{a_k}{k!} \cdot \frac{b_{n-k}}{(n-k)!}$$

or equivalently,

$$b_{n+1} = \sum_{k=0}^{n} \binom{n}{k} a_k b_{n-k}. \tag{12.7}$$


FIGURE 12.12
Counting up-down permutations of odd length.
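
The recursions (12.6) and (12.7), together with the initial values $a_0 = 0$ and $b_0 = 1$, determine both sequences. The following Python sketch (the function name `tangent_secant_numbers` is ours, not the text's) computes them and can be compared against the values listed in (12.2).

```python
from math import comb

def tangent_secant_numbers(n_max):
    """Return lists a[0..n_max], b[0..n_max] built from (12.6) and (12.7)."""
    a = [0] * (n_max + 1)   # a[0] = 0 is the constant term of tan x
    b = [0] * (n_max + 1)
    b[0] = 1                # b[0] = 1 is the constant term of sec x
    for n in range(n_max):
        # (12.6): a_{n+1} = sum_k C(n,k) b_k b_{n-k}
        a[n + 1] = sum(comb(n, k) * b[k] * b[n - k] for k in range(n + 1))
        # (12.7): b_{n+1} = sum_k C(n,k) a_k b_{n-k}
        b[n + 1] = sum(comb(n, k) * a[k] * b[n - k] for k in range(n + 1))
    return a, b

a, b = tangent_secant_numbers(11)
print(a)
print(b)
```

Only entries with index at most $n$ are read when computing entry $n+1$, so the two sequences can be filled in side by side.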


Step 3. We give a counting argument to prove the relation

$$c_{n+1} = \sum_{k=0}^{n} \binom{n}{k} d_k d_{n-k}. \tag{12.8}$$

If $n$ is odd, then both sides of this relation are zero, since at least one of $k$ or $n-k$ is odd
for each $k$. Now suppose $n$ is even. How can we build a typical permutation

$$w = w_1 < w_2 > w_3 < \cdots > w_{n+1}$$

counted by $c_{n+1}$? Let us first choose the position of 1 in $w$; say $w_{k+1} = 1$ for some $k$ between
0 and $n$. The required inequalities at position $k+1$ will be satisfied if and only if $k$ is even.
Observe that in the case where $k$ is odd, $d_k d_{n-k} = 0$ so this term contributes nothing to the
right side of (12.8). Given that $k$ is even, choose a $k$-element subset $A$ of the $n$ remaining
letters in $\binom{n}{k}$ ways. Use these letters to fill in the first $k$ positions of $w$, subject to the
required inequalities (12.4), in any of $d_k$ ways. Use the remaining letters to fill in the last
$(n+1)-(k+1) = n-k$ positions of $w$ (subject to the inequalities (12.5), reindexed to
begin at index $k+2$), in any of $d_{n-k}$ ways. The desired relation now follows from the sum
and product rules. See Figure 12.12, in which $w$ is visualized as a sequence of line segments
connecting the points $(i, w_i)$ for $1 \leq i \leq n+1$.
Step 4. We give a counting argument to prove the relation

$$d_{n+1} = \sum_{k=0}^{n} \binom{n}{k} c_k d_{n-k}. \tag{12.9}$$

Both sides are zero if $n$ is even. If $n$ is odd, we must build a permutation

$$w = w_1 < w_2 > w_3 < \cdots < w_{n+1}.$$

First choose an index $k$ with $0 \leq k \leq n$, and define $w_{k+1} = n+1$. This time, to get a nonzero
contribution from this value of $k$, we need $k$ to be odd. Now pick a $k$-element subset $A$ of
the $n$ remaining letters. Use the letters in $A$ to fill in $w_1, w_2, \ldots, w_k$ ($c_k$ ways), and use the
remaining letters to fill in $w_{k+2}, \ldots, w_{n+1}$ ($d_{n-k}$ ways). See Figure 12.13.


FIGURE 12.13
Counting up-down permutations of even length.

Step 5. A routine induction argument now shows that $a_n = c_n$ and $b_n = d_n$ for all
$n \geq 0$, since the pair of sequences $(a_n)$, $(b_n)$ satisfy the same system of recursions and initial
conditions as the pair of sequences $(c_n)$, $(d_n)$. This completes the proof.
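
As a sanity check on the statement just proved, one can count up-down permutations by brute force for small $n$. The sketch below is only an illustration; the helper names `is_updown`, `c`, and `d` are ours. It uses the conventions $c_n = 0$ for even $n$ and $d_n = 0$ for odd $n$ from the text.

```python
from itertools import permutations

def is_updown(w):
    """True if w_1 < w_2 > w_3 < ... (alternating, starting with an ascent)."""
    for i in range(len(w) - 1):
        if i % 2 == 0 and not w[i] < w[i + 1]:
            return False
        if i % 2 == 1 and not w[i] > w[i + 1]:
            return False
    return True

def count_updown(n):
    return sum(1 for w in permutations(range(1, n + 1)) if is_updown(w))

def c(n):
    # pattern (12.3) ends with a descent, which forces n to be odd
    return count_updown(n) if n % 2 == 1 else 0

def d(n):
    # pattern (12.4) ends with an ascent, which forces n to be even
    return count_updown(n) if n % 2 == 0 else 0

print([c(n) for n in range(8)])
print([d(n) for n in range(7)])
```

The enumeration visits all $n!$ permutations, so it is practical only for small $n$; its purpose is to compare the first few counts with the coefficients $a_n$ and $b_n$.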




12.9 Tournaments and the Vandermonde Determinant 
This section uses the combinatorics of tournaments to prove a famous determinant formula. 


12.39. Definition: Tournaments. An $n$-player tournament is a digraph $t$ with vertex
set $[n] = \{1, 2, \ldots, n\}$ such that, for $1 \leq i < j \leq n$, exactly one of the directed edges $(i,j)$
or $(j,i)$ is an edge of $t$. Let $T_n$ be the set of all such tournaments.


Intuitively, the $n$ vertices represent $n$ players who compete in a series of one-on-one
matches. Each player plays every other player exactly once, and there are no ties. If player $i$
beats player $j$, the edge $(i,j)$ is part of the tournament; otherwise, the edge $(j,i)$ is included.


12.40. Definition: Weights, Inversions, and Sign for Tournaments. Suppose $t \in T_n$
is a tournament. The weight of $t$ is $\mathrm{wt}(t) = \prod_{i=1}^{n} x_i^{\mathrm{outdeg}_t(i)}$. The inversion number of $t$ is
$\mathrm{inv}(t) = \sum_{1 \leq i < j \leq n} \chi((j,i) \in t)$. The sign of $t$ is $\mathrm{sgn}(t) = (-1)^{\mathrm{inv}(t)}$.

Informally, $\mathrm{wt}(t) = x_1^{e_1} \cdots x_n^{e_n}$ iff player $i$ beats $e_i$ other players for all $i$. If we think
of the numbers $1, 2, \ldots, n$ as giving the initial rankings of the players, $\mathrm{inv}(t)$ counts the
number of times a lower-ranked player beats a higher-ranked one (with 1 being the highest
rank).


12.41. Example. Consider the tournament $t \in T_5$ with edge set

$$\{(1,3), (1,4), (1,5), (2,1), (2,4), (3,2), (3,4), (3,5), (5,2), (5,4)\}.$$

We have $\mathrm{wt}(t) = x_1^3 x_2^2 x_3^3 x_5^2$, $\mathrm{inv}(t) = 4$, and $\mathrm{sgn}(t) = +1$.
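
The statistics in this example are easy to recompute mechanically. In the sketch below, a tournament is stored as a list of directed edges, and the weight is represented by its list of exponents (the outdegrees); the function name `tournament_stats` is our own choice.

```python
def tournament_stats(n, edges):
    """Return (exponents of x_1..x_n in wt(t), inv(t), sgn(t))."""
    outdeg = [0] * (n + 1)
    for winner, _loser in edges:
        outdeg[winner] += 1
    # an inversion is an edge (j, i) with i < j: a lower-ranked player wins
    inv = sum(1 for winner, loser in edges if winner > loser)
    sign = -1 if inv % 2 else 1
    return outdeg[1:], inv, sign

edges = [(1, 3), (1, 4), (1, 5), (2, 1), (2, 4),
         (3, 2), (3, 4), (3, 5), (5, 2), (5, 4)]
exponents, inv, sign = tournament_stats(5, edges)
print(exponents, inv, sign)
```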


12.42. Theorem: Tournament Generating Function. For all $n \geq 1$,

$$\sum_{t \in T_n} \mathrm{sgn}(t)\,\mathrm{wt}(t) = \prod_{1 \leq i < j \leq n} (x_i - x_j).$$


Proof. We can build a tournament $t \in T_n$ by making a sequence of binary choices, indexed
by the pairs $i < j$ with $i, j \in [n]$: for each $i < j$, pick either $(i,j)$ or $(j,i)$ and add this edge
to $t$. Let us examine the effect of this choice on $\mathrm{wt}(t)$, $\mathrm{inv}(t)$, and $\mathrm{sgn}(t)$. If we add $(i,j)$
to $t$ (so $i$ beats $j$), the exponent of $x_i$ goes up by 1, inversions go up by zero, and the sign
is unchanged. If we add $(j,i)$ to $t$ instead, the exponent of $x_j$ goes up by 1, inversions go
up by one, and the sign is multiplied by $-1$. The generating function $(+x_i - x_j)$ records
the effect of this choice. The proof is completed by invoking the product rule for generating
functions. $\square$
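
The identity in 12.42 can be checked numerically for small $n$ by enumerating all $2^{n(n-1)/2}$ tournaments and evaluating both sides at chosen integer values of $x_1, \ldots, x_n$. This is a sketch under that assumption (numeric evaluation rather than symbolic polynomials); the function names are ours.

```python
from itertools import combinations, product
from math import prod

def signed_weight_sum(xs):
    """Sum of sgn(t) * wt(t) over all tournaments on len(xs) players."""
    n = len(xs)
    pairs = list(combinations(range(n), 2))      # all (i, j) with i < j
    total = 0
    for choice in product((False, True), repeat=len(pairs)):
        term = 1
        for (i, j), upset in zip(pairs, choice):
            # no upset: edge (i, j), factor +x_i
            # upset:    edge (j, i), factor -x_j (one more inversion)
            term *= -xs[j] if upset else xs[i]
        total += term
    return total

def difference_product(xs):
    return prod(xs[i] - xs[j] for i, j in combinations(range(len(xs)), 2))

xs = [2, 3, 5, 7]
print(signed_weight_sum(xs), difference_product(xs))
```

Agreement at a few integer points is evidence rather than proof, but it mirrors the product-rule argument directly: each pair contributes one factor $x_i$ or $-x_j$.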


Given a tournament $t$, there may exist three players $u, v, w$ where $u$ beats $v$, $v$ beats $w$,
and $w$ beats $u$. This situation occurs whenever the digraph $t$ contains a directed 3-cycle.
Let us give a name to tournaments where this circularity condition does not occur.


12.43. Definition: Transitive Tournaments. A tournament $t \in T_n$ is transitive iff for
all $u, v, w \in [n]$, $(u,v) \in t$ and $(v,w) \in t$ imply $(u,w) \in t$.


Note that $(u,v) \in t$ and $(v,w) \in t$ force $u \neq w$, and then $(u,w) \notin t$ is equivalent to
$(w,u) \in t$. It follows that a tournament is not transitive iff there exist $u, v, w \in [n]$ with
$(u,v) \in t$ and $(v,w) \in t$ and $(w,u) \in t$.


12.44. Theorem: Generating Function for Transitive Tournaments. Let $T'_n$ be the
set of transitive tournaments in $T_n$. Then

$$\sum_{t \in T'_n} \mathrm{sgn}(t)\,\mathrm{wt}(t) = \sum_{w \in S_n} \mathrm{sgn}(w) \prod_{k=1}^{n} x_{w_k}^{n-k}.$$


Proof. We define a bijection $f : T'_n \to S_n$ that will be used to transfer signs and weights
from $T'_n$ to $S_n$. Given $t \in T'_n$, define an associated relation $\preceq$ on $[n]$ by setting $u \preceq v$ iff $u = v$
or $(u,v) \in t$. This relation is evidently reflexive, antisymmetric (since $t$ is a tournament),
and transitive (since $t$ is transitive). Furthermore, $u \preceq v$ or $v \preceq u$ for all $u, v \in [n]$ since $t$ is a
tournament. So $\preceq$ is a total ordering of $[n]$. This ordering determines a unique permutation
of the players, namely

$$f(t) = w = w_1 \prec w_2 \prec \cdots \prec w_n.$$

For all $k$, player $w_k$ beats all players $w_m$ for $m > k$ and loses to all players $w_m$ for $m < k$.
This remark shows that $t$ is uniquely determined by $w$, so the map $f$ is a bijection.

Let us compare $\mathrm{inv}(t)$ to $\mathrm{inv}(w)$, where $w = f(t)$. Consider two players $i = w_k$ and
$j = w_m$ with $i < j$ (so $w_m > w_k$). This pair contributes to $\mathrm{inv}(t)$ iff $(j,i) \in t$ iff $j$ beats $i$
in $t$ iff $j$ appears before $i$ in $w$ iff $m < k$ iff the letters in positions $m, k$ of $w$ contribute to
$\mathrm{inv}(w)$. So $\mathrm{inv}(t) = \mathrm{inv}(w)$ and $\mathrm{sgn}(t) = \mathrm{sgn}(w)$. Next, let us express $\mathrm{wt}(t)$ in terms of $w$.
Since player $w_k$ beats all players $w_m$ in the range $k < m \leq n$, we see that

$$\mathrm{wt}(t) = \prod_{k=1}^{n} x_{w_k}^{n-k}.$$

Define $\mathrm{wt}(w)$ by the right side of this formula. The theorem now follows because $f$ is a
weight-preserving, sign-preserving bijection. $\square$


We can use the bijection f to characterize transitive tournaments. 


12.45. Theorem: Criterion for Transitive Tournaments. A tournament $t \in T_n$ is
transitive iff no two vertices in $[n]$ have the same outdegree.


Proof. If $t$ is transitive, consider $w = f(t) = w_1 \prec w_2 \prec \cdots \prec w_n$. As shown above,
$\mathrm{wt}(t) = \prod_{k=1}^{n} x_{w_k}^{n-k}$. The exponents $n-k$ are all distinct, so every vertex has a different
outdegree. Conversely, suppose $t \in T_n$ is such that every vertex has a different outdegree.
There are $n$ vertices and $n$ possible outdegrees (namely $0, 1, \ldots, n-1$), so each possible
outdegree occurs at exactly one vertex. Let $w_1$ be the unique vertex with outdegree $n-1$.
Then $w_1$ beats all other players. Next, let $w_2$ be the unique vertex with outdegree $n-2$.
Then $w_2$ must beat all players except $w_1$. Continuing similarly, we obtain a permutation
$w = w_1, w_2, \ldots, w_n$ of $[n]$ such that $w_j$ beats $w_k$ iff $j < k$. To confirm that $t$ is transitive,
consider three players $w_i, w_j, w_k$ with $(w_i, w_j) \in t$ and $(w_j, w_k) \in t$. Then $i < j$ and $j < k$,
so $i < k$, so $(w_i, w_k) \in t$. $\square$
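
The criterion in 12.45 can be tested exhaustively for small $n$. The following sketch (with helper names of our own choosing) enumerates all tournaments on $[n]$ as sets of directed edges and compares the definition of transitivity with the outdegree test; it also counts the transitive tournaments, which should number $n!$ by the bijection $f$.

```python
from itertools import combinations, product

def all_tournaments(n):
    """Yield every tournament on {1..n} as a set of directed edges."""
    pairs = list(combinations(range(1, n + 1), 2))
    for flips in product((False, True), repeat=len(pairs)):
        yield {(j, i) if flip else (i, j) for (i, j), flip in zip(pairs, flips)}

def is_transitive(t):
    # (u, v) in t and (v, w) in t must imply (u, w) in t
    return all((u, w) in t
               for (u, v) in t
               for (v2, w) in t if v2 == v)

def has_distinct_outdegrees(n, t):
    outdeg = [0] * (n + 1)
    for winner, _loser in t:
        outdeg[winner] += 1
    return len(set(outdeg[1:])) == n

n = 4
agree = all(is_transitive(t) == has_distinct_outdegrees(n, t)
            for t in all_tournaments(n))
num_transitive = sum(1 for t in all_tournaments(n) if is_transitive(t))
print(agree, num_transitive)
```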


12.46. Theorem: Vandermonde Determinant Formula. Let $x_1, \ldots, x_n$ be fixed
elements in a commutative ring $R$. Define an $n \times n$ matrix $V$ by setting $V(i,j) = x_j^{n-i}$ for
$1 \leq i, j \leq n$. Then

$$\det(V) = \prod_{1 \leq i < j \leq n} (x_i - x_j).$$


Proof. According to the definition of determinants in 9.37,

$$\det(V) = \sum_{w \in S_n} \mathrm{sgn}(w) \prod_{k=1}^{n} V(k, w(k)) = \sum_{w \in S_n} \mathrm{sgn}(w) \prod_{k=1}^{n} x_{w(k)}^{n-k}. \tag{12.10}$$


This is the generating function for transitive tournaments, whereas $\prod_{1 \leq i < j \leq n} (x_i - x_j)$ is the
generating function for all tournaments with $n$ players. So, it suffices to define a
sign-reversing, weight-preserving involution $I : T_n \to T_n$ with fixed point set $T'_n$. Define $I(t) = t$
for $t \in T'_n$. Now consider a non-transitive tournament $t \in T_n \setminus T'_n$. By 12.45, there exist
two vertices $i < j \in [n]$ with the same outdegree in $t$. If there are several pairs of vertices
with the same outdegree, choose the pair such that $i$ is minimized, and then $j$ is minimized. Define $I(t)$
by switching the roles of $i$ and $j$ in $t$; more precisely, replace every directed edge $(u,v)$ in $t$
by $(s_{i,j}(u), s_{i,j}(v))$, where $s_{i,j}$ is the transposition $(i,j) \in S_n$. The resulting tournament is
non-transitive (since $i$ and $j$ still have the same outdegree in $I(t)$) and has the same weight
as $t$. Furthermore, $I(I(t)) = t$.

Finally, we show that $\mathrm{sgn}(I(t)) = -\mathrm{sgn}(t)$. Consider the factorization of $(i,j) \in S_n$ into
$2(j-i)-1$ basic transpositions:

$$(i,j) = (i,i+1)(i+1,i+2) \cdots (j-2,j-1)(j-1,j)(j-2,j-1) \cdots (i+1,i+2)(i,i+1).$$

We can pass from $t$ to $I(t)$ in stages, by applying these basic transpositions one at a time
to the endpoints of the directed edges in $t$. We claim that each such step changes the sign
of the tournament. For, consider what happens to the inversion count when we pass from
a tournament $z$ to $z'$ by switching labels $k$ and $k+1$. The inversion $(k+1,k)$ is present
in exactly one of the tournaments $z$ and $z'$, and the other inversions are unaffected by the
label switch. So $\mathrm{inv}(z')$ differs from $\mathrm{inv}(z)$ by $\pm 1$, and hence $\mathrm{sgn}(z') = -\mathrm{sgn}(z)$. Since we
pass from $t$ to $I(t)$ by an odd number of moves of this type (namely $2(j-i)-1$), we see
that $\mathrm{sgn}(I(t)) = -\mathrm{sgn}(t)$, as desired. $\square$
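
For a concrete check of the Vandermonde formula, one can expand the determinant as the signed sum over permutations in (12.10) and compare with the product of differences at integer points. The sketch below uses 0-based indices, so the exponent $n-k$ becomes `n - 1 - k`; the function names are ours.

```python
from itertools import combinations, permutations
from math import prod

def perm_sign(w):
    """Sign of a permutation given in one-line form, via its inversion count."""
    inv = sum(1 for a, b in combinations(range(len(w)), 2) if w[a] > w[b])
    return -1 if inv % 2 else 1

def vandermonde_det(xs):
    """det(V) for V(i, j) = x_j^(n-i), expanded as in (12.10)."""
    n = len(xs)
    return sum(perm_sign(w) * prod(xs[w[k]] ** (n - 1 - k) for k in range(n))
               for w in permutations(range(n)))

def difference_product(xs):
    return prod(xs[i] - xs[j] for i, j in combinations(range(len(xs)), 2))

xs = [2, 3, 5, 7]
print(vandermonde_det(xs), difference_product(xs))
```

The permutation expansion has $n!$ terms, so this is meant only as a small-case illustration, not as a practical way to compute determinants.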
12.10 Hook-Length Formula


This section presents a probabilistic proof of the hook-length formula for the number of 
standard tableaux of a given shape. This formula was first stated in the Introduction. For 
the reader’s convenience, we begin by recalling the relevant definitions. 


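12.47. Definitions. An integer partition of $n$ is a weakly decreasing sequence
$\lambda = (\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_l)$ of positive integers with $\lambda_1 + \cdots + \lambda_l = n$. The diagram of $\lambda$ is

$$\mathrm{dg}(\lambda) = \{(i,j) \in \mathbb{N} \times \mathbb{N} : 1 \leq i \leq l,\ 1 \leq j \leq \lambda_i\}.$$

Each $(i,j) \in \mathrm{dg}(\lambda)$ is called a box or a cell. We take $i$ as the row index and $j$ as the column
index, where the topmost row is row 1. Given any cell $c = (i,j) \in \mathrm{dg}(\lambda)$, the hook of $c$ in $\lambda$
is

$$H(c) = \{(i,k) \in \mathrm{dg}(\lambda) : k \geq j\} \cup \{(k,j) \in \mathrm{dg}(\lambda) : k \geq i\}.$$

The hook-length of $c$ in $\lambda$ is $h(c) = |H(c)|$. A corner box of $\lambda$ is a cell $c \in \mathrm{dg}(\lambda)$ with
$h(c) = 1$. A standard tableau of shape $\lambda$ is a bijection $S : \mathrm{dg}(\lambda) \to \{1, 2, \ldots, n\}$ such that
$S(i,j) < S(i,j+1)$ for all $i,j$ such that $(i,j), (i,j+1) \in \mathrm{dg}(\lambda)$, and $S(i,j) < S(i+1,j)$
for all $i,j$ such that $(i,j), (i+1,j) \in \mathrm{dg}(\lambda)$. Let $\mathrm{SYT}(\lambda)$ be the set of standard tableaux of
shape $\lambda$, and let $f^\lambda = |\mathrm{SYT}(\lambda)|$.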


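12.48. Example. If $\lambda = (7,5,5,4,2,1)$ and $c = (3,2)$, then

$$H(c) = \{(3,2), (3,3), (3,4), (3,5), (4,2), (5,2)\}$$

and $h(c) = 6$. We can visualize $\mathrm{dg}(\lambda)$ and $H(c)$ using the following picture, in which the
cells of $H(c)$ are marked $*$.

    . . . . . . .
    . . . . .
    . * * * *
    . * . .
    . *
    .

Let $\lambda'_j$ be the number of boxes in column $j$ of $\mathrm{dg}(\lambda)$. Then $h(i,j) = (\lambda_i - j) + (\lambda'_j - i) + 1$.
We use this formula to establish the following lemma.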
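
The formula $h(i,j) = (\lambda_i - j) + (\lambda'_j - i) + 1$ translates directly into code. In this sketch a partition is a tuple of row lengths and cells are 1-based, as in the text; the names `column_lengths` and `hook_length` are ours.

```python
def column_lengths(shape):
    """Number of boxes in each column of the diagram (the conjugate partition)."""
    width = shape[0] if shape else 0
    return [sum(1 for part in shape if part >= j) for j in range(1, width + 1)]

def hook_length(shape, i, j):
    """Hook length of cell (i, j), 1-based: arm + leg + 1."""
    cols = column_lengths(shape)
    arm = shape[i - 1] - j      # cells strictly to the right in row i
    leg = cols[j - 1] - i       # cells strictly below in column j
    return arm + leg + 1

shape = (7, 5, 5, 4, 2, 1)
print(hook_length(shape, 3, 2))
```

For the shape of Example 12.48, the cell $(3,5)$ is a corner box, so the relation $h(i,j) = h(r,j) + h(i,s) - 1$ can be spot-checked there, for instance with $(i,j) = (1,2)$.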


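12.49. Lemma. Suppose $\lambda$ is a partition of $n$, $(r,s)$ is a corner box of $\lambda$, and $(i,j) \in \mathrm{dg}(\lambda)$
satisfies $i < r$ and $j < s$. Then $h(i,j) = h(r,j) + h(i,s) - 1$.


Proof. Since $(r,s)$ is a corner box, $\lambda_r = s$ and $\lambda'_s = r$. So

$$\begin{aligned}
h(r,j) + h(i,s) - 1 &= [(\lambda_r - j) + (\lambda'_j - r) + 1] + [(\lambda_i - s) + (\lambda'_s - i) + 1] - 1 \\
&= (s - j) + (\lambda'_j - r) + (\lambda_i - s) + (r - i) + 1 \\
&= (\lambda_i - j) + (\lambda'_j - i) + 1 = h(i,j). \qquad \square
\end{aligned}$$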


12.50. Theorem: Hook-Length Formula. For any partition $\lambda$ of $n$,

$$f^\lambda = \frac{n!}{\prod_{c \in \mathrm{dg}(\lambda)} h(c)}.$$


The idea of the proof is to define a random algorithm that takes a partition $\lambda$ of $n$ as
input and produces a standard tableau $S \in \mathrm{SYT}(\lambda)$ as output. We will prove in 12.55 that
this algorithm outputs any given standard tableau $S$ with probability

$$p = \frac{\prod_{c \in \mathrm{dg}(\lambda)} h(c)}{n!}.$$

This probability depends only on $\lambda$, not on $S$, so we obtain a uniform probability distribution
on the sample space $\mathrm{SYT}(\lambda)$. So, on one hand, each standard tableau is produced with
probability $p$; and on the other hand, each standard tableau is produced with probability
$1/|\mathrm{SYT}(\lambda)| = 1/f^\lambda$. Thus $f^\lambda = 1/p$, and we obtain the hook-length formula.
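
To see the hook-length formula in action, the sketch below computes $n!$ divided by the product of the hook lengths and compares it with an independent count. The independent count is not the proof given in this section: it uses the elementary observation that the entry $n$ of a standard tableau must sit in a corner box, so $f^\lambda$ is the sum of $f^\mu$ over all shapes $\mu$ obtained by deleting one corner. Function names are ours.

```python
from functools import lru_cache
from math import factorial, prod

def hook_length_count(shape):
    """n! divided by the product of all hook lengths of the shape."""
    if not shape:
        return 1
    cols = [sum(1 for part in shape if part >= j) for j in range(1, shape[0] + 1)]
    hooks = [(shape[i] - j - 1) + (cols[j] - i - 1) + 1   # 0-based cell (i, j)
             for i in range(len(shape)) for j in range(shape[i])]
    return factorial(sum(shape)) // prod(hooks)

@lru_cache(maxsize=None)
def count_by_corner_removal(shape):
    """Count standard tableaux by deciding which corner holds the largest entry."""
    if not shape:
        return 1
    total = 0
    for r in range(len(shape)):
        is_corner = r == len(shape) - 1 or shape[r + 1] < shape[r]
        if is_corner:
            smaller = shape[:r] + (shape[r] - 1,) + shape[r + 1:]
            total += count_by_corner_removal(tuple(p for p in smaller if p > 0))
    return total

for shape in [(3, 2), (4, 3, 1), (2, 2, 1, 1)]:
    print(shape, hook_length_count(shape), count_by_corner_removal(shape))
```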

Here is an informal description of the algorithm for generating a random standard
tableau of shape $\lambda$. Start at a random cell in the shape $\lambda$. As long as we are not at a
corner box, we jump from our current box $c$ to some other cell in $H(c)$; each cell in the
hook is chosen with equal probability. This jumping process eventually takes us to a corner
cell. We place the entry $n$ in this box, and then pretend this cell is no longer there. We are
left with a partition $\mu$ of size $n-1$. Proceed recursively to select a random standard tableau
of shape $\mu$. Adding back the corner cell containing $n$ gives the desired tableau of shape $\lambda$.

Now we give a formal description of the algorithm. Every random choice below is to be 
independent of all other choices. 


12.51. Tableau Generation Algorithm. The input to the algorithm is a partition $\lambda$ of
$n$. The output is a tableau $S \in \mathrm{SYT}(\lambda)$, constructed according to the following random
procedure. As a base case, if $n = 0$, return the empty tableau of shape 0.


1. Choose a random cell $c \in \mathrm{dg}(\lambda)$. Each cell in $\mathrm{dg}(\lambda)$ is chosen with probability
$1/n$.

2. While $h(c) > 1$, do the following.
2a. Choose a random cell $c' \in H(c) \setminus \{c\}$. Each cell in $H(c) \setminus \{c\}$ is chosen
with probability $1/(h(c)-1)$.
2b. Replace $c$ by $c'$ and go back to step 2.

3. Now $c$ is a corner box of $\mathrm{dg}(\lambda)$, so $\mathrm{dg}(\lambda) \setminus \{c\}$ is the diagram of some partition
$\mu$ of $n-1$. Recursively use the same algorithm to generate a random standard
tableau $S' \in \mathrm{SYT}(\mu)$. Extend this to a standard tableau $S \in \mathrm{SYT}(\lambda)$ by setting
$S(c) = n$, and output $S$ as the answer.
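
The steps above can be sketched in Python as follows. This is an illustrative implementation under a few assumptions of ours: the recursion in step 3 is unrolled into a loop that places $n, n-1, \ldots, 1$ in turn, the tableau is returned as a dictionary from 1-based cells to entries, and the names `random_tableau`, `hook_without_cell`, and `is_standard` are hypothetical.

```python
import random

def hook_without_cell(shape, cell):
    """Cells of H(c) other than c itself, for a 1-based cell c = (i, j)."""
    i, j = cell
    arm = [(i, k) for k in range(j + 1, shape[i - 1] + 1)]
    leg = [(k, j) for k in range(i + 1, len(shape) + 1) if shape[k - 1] >= j]
    return arm + leg

def random_tableau(shape, rng=random):
    """Generate a standard tableau of the given shape by repeated hook walks."""
    shape = list(shape)
    tableau = {}
    n = sum(shape)
    while n > 0:
        # Step 1: uniform random starting cell.
        cells = [(i + 1, j + 1) for i, row in enumerate(shape) for j in range(row)]
        c = rng.choice(cells)
        # Step 2: jump within the hook until a corner box is reached.
        while True:
            rest = hook_without_cell(shape, c)
            if not rest:
                break
            c = rng.choice(rest)
        # Step 3: place n in the corner and remove that box from the shape.
        tableau[c] = n
        shape[c[0] - 1] -= 1
        if shape[-1] == 0:
            shape.pop()
        n -= 1
    return tableau

def is_standard(shape, tableau):
    """Check that tableau is a bijection onto 1..n increasing along rows and columns."""
    cells = {(i + 1, j + 1) for i, row in enumerate(shape) for j in range(row)}
    if set(tableau) != cells:
        return False
    if sorted(tableau.values()) != list(range(1, len(cells) + 1)):
        return False
    for (i, j), value in tableau.items():
        if (i, j + 1) in tableau and tableau[(i, j + 1)] <= value:
            return False
        if (i + 1, j) in tableau and tableau[(i + 1, j)] <= value:
            return False
    return True

t = random_tableau((7, 5, 5, 4, 2, 1), random.Random(0))
print(is_standard((7, 5, 5, 4, 2, 1), t))
```

Passing a seeded `random.Random` instance makes a run reproducible; the default uses the module-level generator.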


Let $(c_1, c_2, c_3, \ldots, c_k)$ be the sequence of cells chosen in steps 1 and 2. Call this
sequence the hook walk for $n$. Note that the hook walk must be finite, since $h(c_1) > h(c_2) >
h(c_3) > \cdots$. Writing $c_s = (i_s, j_s)$ for each $s$, define $I = \{i_1, \ldots, i_{k-1}\} \setminus \{i_k\}$ and
$J = \{j_1, \ldots, j_{k-1}\} \setminus \{j_k\}$. We call $I$ and $J$ the row set and column set for this hook
walk.


12.52. Example. Given $n = 24$ and $\lambda = (7,5,5,4,2,1)$, the first iteration of the algorithm
might proceed as follows.


Here we will place $n = 24$ in corner box $c_4$ and proceed recursively to fill in the rest of the
tableau. The probability that the algorithm will choose this particular hook walk for $n$ is

$$\frac{1}{n} \cdot \frac{1}{h(c_1)-1} \cdot \frac{1}{h(c_2)-1} \cdot \frac{1}{h(c_3)-1}.$$

The row set and column set for this hook walk are $I = \{1\}$ and $J = \{2,3\}$.


The next lemma is the key technical fact needed to analyze the behaviour of the tableau 
generation algorithm. 


12.53. Lemma. Given a partition $\lambda$ of $n$, a corner box $c = (r,s)$, and sets $I \subseteq \{1, 2, \ldots, r-1\}$
and $J \subseteq \{1, 2, \ldots, s-1\}$, the probability that the hook walk for $n$ ends at $c$ with row
set $I$ and column set $J$ is

$$p(\lambda, c, I, J) = \frac{1}{n} \prod_{i \in I} \frac{1}{h(i,s)-1} \prod_{j \in J} \frac{1}{h(r,j)-1}.$$


Proof. Write $I = \{i_1 < i_2 < \cdots < i_\ell\}$ and $J = \{j_1 < j_2 < \cdots < j_m\}$, where $\ell, m \geq 0$. First
we consider some degenerate cases. Say $I = J = \emptyset$. Then the hook walk for $n$ consists of
the single cell $c$. This happens with probability $1/n$, in agreement with the formula in the
lemma (interpreting the empty products as 1). Next, suppose $I$ is empty but $J$ is not. The
hook walk for $n$ in this case must be $c_1 = (r, j_1), c_2 = (r, j_2), \ldots, c_m = (r, j_m), c_{m+1} = (r, s)$.
The probability of this hook walk is

$$\frac{1}{n} \cdot \frac{1}{h(c_1)-1} \cdot \frac{1}{h(c_2)-1} \cdots \frac{1}{h(c_m)-1} = \frac{1}{n} \prod_{j \in J} \frac{1}{h(r,j)-1}.$$

Similarly, the result holds when $J$ is empty and $I$ is nonempty.

Now consider the case where both $I$ and $J$ are nonempty. We will argue by induction on
$|I| + |J|$. A hook walk with row set $I$ and column set $J$ ending at $c$ must begin with the cell
$c_1 = (i_1, j_1)$; this cell is chosen in step 1 of the algorithm with probability $1/n$. Now, there
are two possibilities for cell $c_2$: either $c_2 = (i_1, j_2)$ or $c_2 = (i_2, j_1)$. Each possibility for $c_2$ is
chosen with probability $1/(h(c_1)-1) = 1/(h(i_1, j_1)-1)$. When $c_2 = (i_1, j_2)$, the sequence
$(c_2, \ldots, c_k)$ is a hook walk ending at $c$ with row set $I$ and column set $J' = J \setminus \{j_1\}$. By
induction, such a hook walk occurs with probability

$$\frac{1}{n} \prod_{i \in I} \frac{1}{h(i,s)-1} \prod_{j \in J'} \frac{1}{h(r,j)-1}.$$

However, since the walk really started at $c_1$ and proceeded to $c_2$, we replace the first factor
$1/n$ by $\frac{1}{n} \cdot \frac{1}{h(c_1)-1}$. Similarly, when $c_2 = (i_2, j_1)$, the sequence $(c_2, \ldots, c_k)$ is a hook walk
ending at $c$ with row set $I' = I \setminus \{i_1\}$ and column set $J$. So the probability that the hook
walk starts at $c_1$ and proceeds through $c_2 = (i_2, j_1)$ is

$$\frac{1}{n} \cdot \frac{1}{h(c_1)-1} \prod_{i \in I'} \frac{1}{h(i,s)-1} \prod_{j \in J} \frac{1}{h(r,j)-1}.$$

Adding these two terms, we see that

$$p(\lambda, c, I, J) = \frac{1}{n} \cdot \frac{1}{h(c_1)-1} \prod_{i \in I'} \frac{1}{h(i,s)-1} \prod_{j \in J'} \frac{1}{h(r,j)-1} \cdot \left( \frac{1}{h(i_1,s)-1} + \frac{1}{h(r,j_1)-1} \right).$$

The factor in parentheses is

$$\frac{h(r,j_1) + h(i_1,s) - 2}{(h(i_1,s)-1)(h(r,j_1)-1)}.$$

Using 12.49, the numerator simplifies to $h(i_1, j_1) - 1 = h(c_1) - 1$. This factor cancels and
leaves us with

$$p(\lambda, c, I, J) = \frac{1}{n} \prod_{i \in I} \frac{1}{h(i,s)-1} \prod_{j \in J} \frac{1}{h(r,j)-1}.$$

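This completes the induction proof. $\square$


12.54. Theorem: Probability that a Hook Walk ends at c. Given a partition $\lambda$ of $n$
and a corner box $c = (r,s)$ of $\mathrm{dg}(\lambda)$, the probability that the hook walk for $n$ ends at $c$ is

$$p(\lambda, c) = \frac{1}{n} \prod_{i=1}^{r-1} \frac{h(i,s)}{h(i,s)-1} \prod_{j=1}^{s-1} \frac{h(r,j)}{h(r,j)-1}.$$


Proof. Write $[r-1] = \{1, 2, \ldots, r-1\}$ and $[s-1] = \{1, 2, \ldots, s-1\}$. By the sum rule for
probabilities,

$$\begin{aligned}
p(\lambda, c) &= \sum_{I \subseteq [r-1]} \sum_{J \subseteq [s-1]} p(\lambda, c, I, J) \\
&= \frac{1}{n} \sum_{I \subseteq [r-1]} \sum_{J \subseteq [s-1]} \prod_{i \in I} \frac{1}{h(i,s)-1} \prod_{j \in J} \frac{1}{h(r,j)-1} \\
&= \frac{1}{n} \left( \sum_{I \subseteq [r-1]} \prod_{i \in I} \frac{1}{h(i,s)-1} \right) \cdot \left( \sum_{J \subseteq [s-1]} \prod_{j \in J} \frac{1}{h(r,j)-1} \right).
\end{aligned}$$

By 2.7, we have

$$\sum_{I \subseteq [r-1]} \prod_{i \in I} \frac{1}{h(i,s)-1} = \prod_{i=1}^{r-1} \left( 1 + \frac{1}{h(i,s)-1} \right) = \prod_{i=1}^{r-1} \frac{h(i,s)}{h(i,s)-1}.$$

The sum over $J$ can be simplified in a similar way, giving the formula in the theorem. $\square$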
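
Since every hook walk ends at some corner box, the probabilities $p(\lambda, c) = \frac{1}{n} \prod_{i=1}^{r-1} \frac{h(i,s)}{h(i,s)-1} \prod_{j=1}^{s-1} \frac{h(r,j)}{h(r,j)-1}$ must sum to 1 over all corners $c = (r,s)$. The sketch below checks this with exact rational arithmetic; the helper names are ours.

```python
from fractions import Fraction

def hook_length(shape, i, j):
    """Hook length of the 1-based cell (i, j)."""
    column_length = sum(1 for part in shape if part >= j)
    return (shape[i - 1] - j) + (column_length - i) + 1

def corners(shape):
    """All corner boxes (r, s) of the shape, 1-based."""
    return [(r, shape[r - 1]) for r in range(1, len(shape) + 1)
            if r == len(shape) or shape[r] < shape[r - 1]]

def corner_probability(shape, r, s):
    """Probability that the hook walk ends at corner (r, s), per 12.54."""
    p = Fraction(1, sum(shape))
    for i in range(1, r):
        h = hook_length(shape, i, s)
        p *= Fraction(h, h - 1)
    for j in range(1, s):
        h = hook_length(shape, r, j)
        p *= Fraction(h, h - 1)
    return p

shape = (7, 5, 5, 4, 2, 1)
probs = {c: corner_probability(shape, *c) for c in corners(shape)}
print(probs, sum(probs.values()))
```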
The next theorem is the final step in the proof of the hook-length formula. 


12.55. Theorem: Probability of Generating a Given Tableau. If $\lambda$ is a partition of
$n$ and $S \in \mathrm{SYT}(\lambda)$, the tableau generation algorithm outputs $S$ with probability

$$p = \frac{\prod_{c \in \mathrm{dg}(\lambda)} h(c)}{n!}.$$


Proof. We prove the theorem by induction on $n$. Note first that the result does hold for
$n = 0$ and $n = 1$. For the induction step, assume the result is known for partitions and
tableaux with fewer than $n$ boxes. Let $c^* = (r,s)$ be the cell such that $S(c^*) = n$, let $\mu$
be the partition obtained by removing $c^*$ from $\mathrm{dg}(\lambda)$, and let $S' \in \mathrm{SYT}(\mu)$ be the tableau
obtained by erasing $n$ from $S$. First, the probability that the hook walk for $n$ (in steps 1
and 2 of the algorithm) ends at $c^*$ is $p(\lambda, c^*)$. Given that this has occurred, induction tells
us that the probability of generating $S'$ in step 3 is

$$\frac{\prod_{c \in \mathrm{dg}(\mu)} h_\mu(c)}{(n-1)!},$$

where $h_\mu(c)$ refers to the hook length of $c$ relative to $\mathrm{dg}(\mu)$. Multiplying these probabilities,
the probability of generating $S$ is therefore

$$\frac{1}{n!} \prod_{c \in \mathrm{dg}(\mu)} h_\mu(c) \prod_{i=1}^{r-1} \frac{h_\lambda(i,s)}{h_\lambda(i,s)-1} \prod_{j=1}^{s-1} \frac{h_\lambda(r,j)}{h_\lambda(r,j)-1}.$$

Now, consider what happens to the hook lengths of cells when we pass from $\mu$ to $\lambda$ by
restoring the box $c^* = (r,s)$. For every cell $c = (i,j) \in \mathrm{dg}(\mu)$ with $i \neq r$ and $j \neq s$, we
have $h_\mu(c) = h_\lambda(c)$. If $c = (i,s) \in \mathrm{dg}(\mu)$ with $i < r$, then $h_\mu(c) = h_\lambda(c) - 1 = h_\lambda(i,s) - 1$.
Thus, the fractions in the second product convert $h_\mu(c)$ to $h_\lambda(c)$ for each such $c$. Similarly,
if $c = (r,j) \in \mathrm{dg}(\mu)$ with $j < s$, then $h_\mu(c) = h_\lambda(c) - 1 = h_\lambda(r,j) - 1$. So the fractions in
the third product convert $h_\mu(c)$ to $h_\lambda(c)$ for each such $c$. So we are left with

$$\frac{1}{n!} \prod_{c \in \mathrm{dg}(\mu)} h_\lambda(c) = \frac{\prod_{c \in \mathrm{dg}(\lambda)} h_\lambda(c)}{n!},$$

where the last equality follows since $h_\lambda(c^*) = 1$. This completes the induction. $\square$


12.11 Knuth Equivalence 


Let $X$ be a totally ordered set, and let $X^* = \bigcup_{n \geq 0} X^n$ be the set of all words over the
alphabet $X$. Given a word $w \in X^*$, we can use the RSK algorithm to construct the insertion
tableau $P(w)$, which is a semistandard tableau using the same multiset of letters as $w$
(§10.23). This section studies some of the relationships between $w$ and $P(w)$. In particular,
we show that the shape of $P(w)$ contains information about increasing and decreasing
subsequences of $w$. First we show how to encode semistandard tableaux using words.


12.56. Definition: Reading Word of a Tableau. Let $T \in \mathrm{SSYT}_X(\lambda)$, with
$\lambda = (\lambda_1, \ldots, \lambda_k)$. The reading word of $T$ is

$$\mathrm{rw}(T) = T(k,1), T(k,2), \ldots, T(k,\lambda_k),\ T(k-1,1), T(k-1,2), \ldots, T(k-1,\lambda_{k-1}),\ \ldots,\ T(1,1), T(1,2), \ldots, T(1,\lambda_1).$$

Thus, $\mathrm{rw}(T)$ is the concatenation of the weakly increasing words appearing in each row
of $T$, reading the rows from bottom to top. Note that $T(j, \lambda_j) \geq T(j,1) > T(j-1,1)$ for
all $j > 1$. This implies that we can recover the shape of $T$ from $\mathrm{rw}(T)$ by starting a new
row whenever we see a strict descent in $\mathrm{rw}(T)$.


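12.57. Example. Given the tableau

$$T = \begin{array}{ccccccc} 1 & 1 & 2 & 3 & 4 & 4 & 6 \\ 2 & 4 & 5 & 6 & 6 \\ 3 & 5 & 7 & 8 \\ 4 & 6 \end{array}$$

the reading word of $T$ is
$\mathrm{rw}(T) = 463578245661123446$.


Given that the word $w = 7866453446223511224$ is the reading word of some tableau $S$, we
deduce that $S$ must be

$$S = \begin{array}{ccccc} 1 & 1 & 2 & 2 & 4 \\ 2 & 2 & 3 & 5 \\ 3 & 4 & 4 & 6 \\ 4 & 5 \\ 6 & 6 \\ 7 & 8 \end{array}$$

by looking at the descents in $w$.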
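
Both directions of this encoding are short to program. In the sketch below (function names are ours), a tableau is a list of rows from top to bottom, and a word is a list of letters.

```python
def reading_word(rows):
    """Concatenate the rows of a tableau from bottom to top."""
    return [x for row in reversed(rows) for x in row]

def tableau_from_reading_word(word):
    """Recover the rows (top to bottom) by cutting the word at each strict descent."""
    rows_bottom_up = []
    current = []
    for x in word:
        if current and x < current[-1]:   # strict descent: a new row begins
            rows_bottom_up.append(current)
            current = []
        current.append(x)
    if current:
        rows_bottom_up.append(current)
    return rows_bottom_up[::-1]

w = [int(ch) for ch in "7866453446223511224"]
for row in tableau_from_reading_word(w):
    print(row)
```

The second function only inverts `reading_word` on words that really are reading words of semistandard tableaux; for an arbitrary word the rows it returns need not form a tableau.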


Next we introduce two equivalence relations on $X^*$ that are related to the map $w \mapsto P(w)$.


12.58. Definition: P-Equivalence. Two words $v, w \in X^*$ are called $P$-equivalent,
denoted $v \equiv_P w$, iff $P(v) = P(w)$.


12.59. Definition: Knuth Equivalence. The set of elementary Knuth relations of the
first kind on $X$ is

$$K_1 = \{(uyxzv, uyzxv) : u, v \in X^*,\ x, y, z \in X \text{ and } x < y \leq z\}.$$

The set of elementary Knuth relations of the second kind on $X$ is

$$K_2 = \{(uxzyv, uzxyv) : u, v \in X^*,\ x, y, z \in X \text{ and } x \leq y < z\}.$$

Two words $v, w \in X^*$ are Knuth equivalent, denoted $v \equiv_K w$, iff there is a finite sequence
of words $v = v^0, v^1, v^2, \ldots, v^k = w$ such that, for $1 \leq i \leq k$, either $(v^{i-1}, v^i) \in K_1 \cup K_2$ or
$(v^i, v^{i-1}) \in K_1 \cup K_2$.


12.60. Remark. Informally, Knuth equivalence allows us to modify words by repeatedly
changing subsequences of three consecutive letters according to certain rules. Specifically,
if the middle value among the three letters does not occupy the middle position, then the
other two values can switch positions. To determine which value is the "middle value" in the
case of repeated letters, use the rule that the letter to the right is larger. These comments
should aid the reader in remembering the inequalities in the definitions of $K_1$ and $K_2$.
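
The two kinds of elementary relations can be made concrete with a small search procedure. The sketch below (names `knuth_neighbors` and `knuth_class` are ours) lists every word reachable from a given word by one elementary Knuth relation, applied in either direction, and then collects the whole Knuth equivalence class by graph search. Words are tuples of comparable letters.

```python
def knuth_neighbors(word):
    """All words obtained from `word` by one elementary Knuth relation."""
    word = tuple(word)
    result = set()
    for p in range(len(word) - 2):
        a, b, c = word[p:p + 3]
        # First kind: y x z <-> y z x with x < y <= z (swap the last two letters).
        if b < a <= c or c < a <= b:
            result.add(word[:p] + (a, c, b) + word[p + 3:])
        # Second kind: x z y <-> z x y with x <= y < z (swap the first two letters).
        if a <= c < b or b <= c < a:
            result.add(word[:p] + (b, a, c) + word[p + 3:])
    return result

def knuth_class(word):
    """The Knuth equivalence class of `word`, found by depth-first search."""
    start = tuple(word)
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbor in knuth_neighbors(current):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

print(sorted(knuth_class((2, 1, 3))))
print(sorted(knuth_class((1, 3, 2))))
```

In each test, the letter that stays fixed is the "middle value", which sits at one end of the triple; ties are resolved by treating the letter further right as larger, as in the remark.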


It is routine to check that $\equiv_P$ and $\equiv_K$ are equivalence relations on $X^*$. Our current goal
is to prove that these equivalence relations are actually the same. First we show that we
can simulate each step in the tableau insertion algorithm 10.52 using the elementary Knuth
relations.


12.61. Theorem: Reading Words and Knuth Equivalence. For all $v \in X^*$,
$v \equiv_K \mathrm{rw}(P(v))$.


Proof. First note that, for any words $u, z, w, w' \in X^*$, if $w \equiv_K w'$ then $uwz \equiv_K uw'z$.
Now, write $v = v_1 v_2 \cdots v_k$ and argue by induction on $k$. The theorem holds if $k \leq 1$,
since $\mathrm{rw}(P(v)) = v$ in this case. For the induction step, assume $k > 1$ and write
$T' = P(v_1 v_2 \cdots v_{k-1})$, $T = P(v)$. By the induction hypothesis, $v_1 \cdots v_{k-1} \equiv_K \mathrm{rw}(T')$,
so $v = (v_1 \cdots v_{k-1}) v_k \equiv_K \mathrm{rw}(T') v_k$. It will therefore suffice to prove that $\mathrm{rw}(T') v_k$ is Knuth
equivalent to $\mathrm{rw}(T) = \mathrm{rw}(T' \leftarrow v_k)$. This will be proved by induction on $\ell$, the number of
rows in the tableau $T'$.

For the base case, let $\ell = 1$. Then $\mathrm{rw}(T')$ is a weakly increasing sequence $u_1 u_2 \cdots u_{k-1}$.
If $u_{k-1} \leq v_k$, then $T$ is obtained from $T'$ by appending $v_k$ at the end of the first row. In
this situation, $\mathrm{rw}(T') v_k = u_1 \cdots u_{k-1} v_k = \mathrm{rw}(T)$, so the desired result holds. On the other
hand, if $v_k < u_{k-1}$, let $j$ be the least index with $v_k < u_j$. When inserting $v_k$ into $T'$, $v_k$ will
bump $u_j$ into the second row, so that

$$\mathrm{rw}(T) = u_j u_1 u_2 \cdots u_{j-1} v_k u_{j+1} \cdots u_{k-1}.$$

Let us show that this word can be obtained from $u_1 \cdots u_{k-1} v_k$ by a sequence of elementary
Knuth equivalences. If $j \leq k-2$, then $v_k < u_{k-2} \leq u_{k-1}$ implies

$$(u_1 \cdots u_{k-3} u_{k-2} v_k u_{k-1},\ u_1 \cdots u_{k-3} u_{k-2} u_{k-1} v_k) \in K_1.$$

So $\mathrm{rw}(T') v_k$ is Knuth-equivalent to the word obtained by interchanging $v_k$ with the letter
$u_{k-1}$ to its immediate left. Similarly, if $j \leq k-3$, the inequality $v_k < u_{k-3} \leq u_{k-2}$ lets us
interchange $v_k$ with $u_{k-2}$. We can continue in this way, using elementary Knuth relations
of the first kind, to see that

$$\mathrm{rw}(T') v_k \equiv_K u_1 \cdots u_{j-1} u_j v_k u_{j+1} \cdots u_{k-1}.$$

Now, we have $u_{j-1} \leq v_k < u_j$, so an elementary Knuth relation of the second kind
transforms this word into

$$u_1 \cdots u_{j-2} u_j u_{j-1} v_k u_{j+1} \cdots u_{k-1}.$$

If $j > 2$, we now have $u_{j-2} \leq u_{j-1} < u_j$, so we can interchange $u_j$ with $u_{j-2}$. We can
continue in this way until $u_j$ reaches the left end of the word. We have now transformed
$\mathrm{rw}(T') v_k$ into $\mathrm{rw}(T)$ by elementary Knuth equivalences, so $\mathrm{rw}(T') v_k \equiv_K \mathrm{rw}(T)$.

For the induction step, assume $\ell > 1$. Let $T''$ be the tableau $T'$ with its first (longest)
row erased. Then $\mathrm{rw}(T') = \mathrm{rw}(T'') u_1 \cdots u_p$ where $u_1 \leq \cdots \leq u_p$ is the weakly increasing
sequence in the first row of $T'$. If $u_p \leq v_k$, then $\mathrm{rw}(T') v_k = \mathrm{rw}(T)$. Otherwise, assume $v_k$
bumps $u_j$ in the insertion $T' \leftarrow v_k$. By the result in the last paragraph,

$$\mathrm{rw}(T') v_k \equiv_K \mathrm{rw}(T'') u_j u_1 \cdots u_{j-1} v_k u_{j+1} \cdots u_p.$$

Now, by the induction hypothesis, $\mathrm{rw}(T'') u_j \equiv_K \mathrm{rw}(T'' \leftarrow u_j)$. Thus,

$$\mathrm{rw}(T') v_k \equiv_K \mathrm{rw}(T'' \leftarrow u_j) u'$$

where $u'$ is $u_1 \cdots u_p$ with $u_j$ replaced by $v_k$. But, by definition of tableau insertion,
$\mathrm{rw}(T'' \leftarrow u_j) u'$ is precisely $\mathrm{rw}(T)$. This completes the induction step. $\square$


12.62. Example. Let us illustrate how elementary Knuth equivalences implement the steps
in the insertion $T \leftarrow 3$, where

$$T = \begin{array}{cccccc} 1 & 1 & 3 & 4 & 4 & 6 \\ 2 & 2 & 4 & 5 \\ 3 & 4 \end{array}$$

Appending a 3 at the right end of $\mathrm{rw}(T)$, we first compute

$$34\ 2245\ 113446\ 3 \equiv_K 34\ 2245\ 1134436 \equiv_K 34\ 2245\ 1134346 \equiv_K$$
$$34\ 2245\ 1143346 \equiv_K 34\ 2245\ 1413346 \equiv_K 34\ 2245\ 4\ 113346.$$

The steps so far correspond to the insertion of 3 into the first row of $T$, which bumps the
leftmost 4 into the second row. Continuing,

$$34\ 22454\ 113346 \equiv_K 34\ 22544\ 113346 \equiv_K 34\ 25244\ 113346 \equiv_K 34\ 5\ 2244\ 113346,$$

and now the incoming 4 has bumped the 5 into the third row. The process stops here with
the word

$$3452244113346 = \mathrm{rw}\left( \begin{array}{cccccc} 1 & 1 & 3 & 3 & 4 & 6 \\ 2 & 2 & 4 & 4 \\ 3 & 4 & 5 \end{array} \right) = \mathrm{rw}(T \leftarrow 3).$$

This illustrates that $\mathrm{rw}(T)3 \equiv_K \mathrm{rw}(T \leftarrow 3)$.
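
The insertion used in this example can be reproduced with a few lines of Python. The sketch below assumes the usual row-insertion rule described in the proof of 12.61: the incoming letter bumps the leftmost entry strictly greater than it, or is appended to the row if there is no such entry. The names `row_insert` and `insertion_tableau` are ours.

```python
from bisect import bisect_right

def row_insert(rows, x):
    """Return the tableau obtained by inserting x into `rows` (top row first)."""
    rows = [list(row) for row in rows]
    for row in rows:
        pos = bisect_right(row, x)      # index of the leftmost entry > x
        if pos == len(row):
            row.append(x)               # no entry exceeds x: append and stop
            return rows
        row[pos], x = x, row[pos]       # bump that entry into the next row
    rows.append([x])                    # bumped out of the last row: new row
    return rows

def insertion_tableau(word):
    """The insertion tableau P(w), built by inserting the letters left to right."""
    rows = []
    for x in word:
        rows = row_insert(rows, x)
    return rows

T = [[1, 1, 3, 4, 4, 6], [2, 2, 4, 5], [3, 4]]
print(row_insert(T, 3))
```

Applying `insertion_tableau` to the reading word of $T$ returns $T$ itself, which is the kind of relationship between reading words and insertion that this section studies.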


12.63. Definition: Increasing and Decreasing Subsequences. Let $w = w_1 w_2 \cdots w_n \in
X^*$. An increasing subsequence of $w$ of length $\ell$ is a subset $I = \{i_1 < i_2 < \cdots < i_\ell\}$ of
$\{1, 2, \ldots, n\}$ such that $w_{i_1} \leq w_{i_2} \leq \cdots \leq w_{i_\ell}$. A decreasing subsequence of $w$ of length $\ell$
is a subset $I = \{i_1 < i_2 < \cdots < i_\ell\}$ such that $w_{i_1} > w_{i_2} > \cdots > w_{i_\ell}$. A set of $k$ disjoint
increasing subsequences of $w$ is a set $\{I_1, \ldots, I_k\}$ of pairwise disjoint increasing subsequences
of $w$. For each $k \geq 1$, let $\mathrm{inc}_k(w)$ be the maximum value of $|I_1| + \cdots + |I_k|$ over all such
sets. Similarly, let $\mathrm{dec}_k(w)$ be the maximum total length of a set of $k$ disjoint decreasing
subsequences of $w$.


12.64. Theorem: Knuth Equivalence and Monotone Subsequences. For all $v, w \in
X^*$ and all $k \geq 1$, $v \equiv_K w$ implies $\mathrm{inc}_k(v) = \mathrm{inc}_k(w)$ and $\mathrm{dec}_k(v) = \mathrm{dec}_k(w)$.


Proof. It suffices to consider the case where $v$ and $w$ differ by a single elementary Knuth
relation. First suppose

$$v = ayxzb, \quad w = ayzxb, \quad (x < y \leq z)$$

where the $y$ occurs at position $i$. If $I$ is an increasing subsequence of $w$, then $i+1$ and
$i+2$ do not both belong to $I$ (since $z > x$). Therefore, if $\{I_1, \ldots, I_k\}$ is any set of $k$
disjoint increasing subsequences of $w$, we can obtain a set $\{I'_1, \ldots, I'_k\}$ of disjoint increasing
subsequences of $v$ by replacing $i+1$ by $i+2$ and $i+2$ by $i+1$ in any $I_j$ in which one of
these indices appears. This implies that $\mathrm{inc}_k(w) \leq \mathrm{inc}_k(v)$.

To establish the opposite inequality, let $\mathcal{I} = \{I_1, I_2, \ldots, I_k\}$ be any set of $k$ disjoint
increasing subsequences of $v$. We will construct a set of $k$ disjoint increasing subsequences
of $w$ having the same total size as $\mathcal{I}$. The device used in the previous paragraph works here,
unless some member of $\mathcal{I}$ (say $I_1$) contains both $i+1$ and $i+2$. In this case, we cannot have
$i \in I_1$, since $y > x$. If no other member of $\mathcal{I}$ contains $i$, we replace $I_1$ by $(I_1 \setminus \{i+2\}) \cup \{i\}$,
which is an increasing subsequence of $w$. On the other hand, suppose $i+1, i+2 \in I_1$, and
some other member of $\mathcal{I}$ (say $I_2$) contains $i$. Write

$$I_1 = \{j_1 < j_2 < \cdots < j_r < i+1 < i+2 < j_{r+1} < \cdots < j_p\},$$
$$I_2 = \{k_1 < k_2 < \cdots < k_s < i < k_{s+1} < \cdots < k_q\},$$

and note that $v_{j_r} \leq x < z \leq v_{j_{r+1}}$ and $v_{k_s} \leq y \leq v_{k_{s+1}}$. Replace these two disjoint increasing
subsequences of $v$ by

$$I'_1 = \{j_1 < j_2 < \cdots < j_r < i+2 < k_{s+1} < \cdots < k_q\},$$
$$I'_2 = \{k_1 < k_2 < \cdots < k_s < i < i+1 < j_{r+1} < \cdots < j_p\}.$$

Since $w_{j_r} \leq x < w_{k_{s+1}}$ and $w_{k_s} \leq y \leq z \leq w_{j_{r+1}}$, $I'_1$ and $I'_2$ are two disjoint increasing
subsequences of $w$ having the same total length as $I_1$ and $I_2$. This completes the proof that
$\mathrm{inc}_k(w) \geq \mathrm{inc}_k(v)$.

Similar reasoning (left as an exercise for the reader) proves the result in the case where

$$v = axzyb, \quad w = azxyb, \quad (x \leq y < z).$$

We also let the reader prove the statement about decreasing subsequences. $\square$


12.65. Theorem: Subsequences and the Shape of Insertion Tableaux. Let $w \in X^*$
and suppose $P(w)$ has shape $\lambda$. For all $k \geq 1$,

$$\mathrm{inc}_k(w) = \lambda_1 + \cdots + \lambda_k, \qquad \mathrm{dec}_k(w) = \lambda'_1 + \cdots + \lambda'_k.$$

In particular, $\lambda_1$ is the length of the longest increasing subsequence of $w$, whereas $\ell(\lambda)$ is
the length of the longest decreasing subsequence of $w$.


Proof. Let $w' = \mathrm{rw}(P(w))$. We know $w \equiv_K w'$ by 12.61, so $\mathrm{inc}_k(w) = \mathrm{inc}_k(w')$ and
$\mathrm{dec}_k(w) = \mathrm{dec}_k(w')$ by 12.64. So we need only prove

$$\mathrm{inc}_k(w') = \lambda_1 + \cdots + \lambda_k, \qquad \mathrm{dec}_k(w') = \lambda'_1 + \cdots + \lambda'_k.$$

Now, $w'$ consists of increasing sequences of letters of successive lengths $\lambda_l, \ldots, \lambda_2, \lambda_1$ (where
$l = \ell(\lambda)$). By taking $I_1, I_2, \ldots, I_k$ to be the sets of positions of the last $k$ of these sequences, we
obtain $k$ disjoint increasing subsequences of $w'$ of total length $\lambda_1 + \cdots + \lambda_k$. Therefore, $\mathrm{inc}_k(w') \geq
\lambda_1 + \cdots + \lambda_k$.

On the other hand, let $\{I_1, \ldots, I_k\}$ be any $k$ disjoint increasing subsequences of $w'$. Each
position $i$ in $w'$ is associated to a particular box in the diagram of $\lambda$, via 12.56. For example,
position 1 corresponds to the first box in the last row, while the last position corresponds
to the last box in the first row. For each position $i$ that belongs to some $I_j$, place an X in
the corresponding box in the diagram of $\lambda$. Since entries in a given column of $P(w)$ strictly
decrease reading from bottom to top, the X's coming from a given increasing subsequence
$I_j$ must all lie in different columns of the diagram. It follows that every column of the
diagram contains $k$ or fewer X's. Suppose we push all these X's up their columns as far as
possible. Then all the X's in the resulting figure must lie in the top $k$ rows of $\lambda$. It follows
that the number of X's, which is $|I_1| + \cdots + |I_k|$, cannot exceed $\lambda_1 + \cdots + \lambda_k$. This gives
$\mathrm{inc}_k(w') \leq \lambda_1 + \cdots + \lambda_k$. The proof for $\mathrm{dec}_k(w)$ is similar, and is left as an exercise. $\square$


12.66. Theorem: Knuth Equivalence vs. Tableau Shape. For all $v, w \in X^*$, $v \equiv_K w$
implies that $P(v)$ and $P(w)$ have the same shape.


Proof. Let $\lambda$ and $\mu$ be the shapes of $P(v)$ and $P(w)$, respectively. Using 12.64 and 12.65,
we see that $v \equiv_K w$ implies

$$\lambda_k = \mathrm{inc}_k(v) - \mathrm{inc}_{k-1}(v) = \mathrm{inc}_k(w) - \mathrm{inc}_{k-1}(w) = \mu_k \qquad (k \geq 1). \qquad \square$$


12.67. Example. Consider the word $w = 35164872$. As shown in Figure 10.1, we have

             1 2 6 7
    P(w) =   3 4 8
             5

Since the shape is $\lambda = (4,3,1)$, the longest increasing subsequence of $w$ has length 4. Two such subsequences are $I_1 = \{1,2,4,7\}$ (corresponding to the subword 3567) and $I_2 = \{1,2,4,6\}$. Note that the first row of $P(w)$, namely 1267, does not appear as a subword of $w$. Since the column lengths of $\lambda$ are $(3,2,2,1)$, the longest length of two disjoint decreasing subsequences of $w$ is $3 + 2 = 5$. For example, we could take $I_1 = \{6,7,8\}$ and $I_2 = \{4,5\}$ to achieve this. Note that $w' = \mathrm{rw}(P(w)) = 5\ 348\ 1267$. To illustrate the end of the previous proof, consider the two disjoint increasing subsequences $I_1 = \{1,4\}$ and $I_2 = \{2,3,7,8\}$ of $w'$ (this pair does not achieve the maximum length for such subsequences). Drawing X's in the boxes of the diagram associated to the positions in $I_1$ (resp. $I_2$) produces

    [ ][ ][ ][ ]              [ ][ ][X][X]
    [ ][ ][X]        (resp.   [X][X][ ]      ).
    [X]                       [ ]

Combining these diagrams and pushing the X's up as far as they will go, we get

    [X][X][X][X]
    [X][ ][X]
    [ ]

So, indeed, the combined length of $I_1$ and $I_2$ does not exceed $\lambda_1 + \lambda_2$.
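
The computations in this example can be replayed mechanically. The following Python sketch is an illustration only: the function names are our own, and we assume that "increasing" means weakly increasing and "decreasing" means strictly decreasing (the distinction is immaterial for this word, whose letters are distinct). It builds $P(w)$ by row insertion and compares the first row length and the number of rows with brute-force subsequence lengths.

```python
def insert_row(tableau, x):
    """Row-insert x into a semistandard tableau (a list of rows), in place."""
    for row in tableau:
        for c, y in enumerate(row):
            if y > x:                 # leftmost entry strictly larger than x
                row[c], x = x, y      # x takes its place; y is bumped downward
                break
        else:
            row.append(x)             # nothing was bumped: x ends this row
            return
    tableau.append([x])               # the last bumped letter starts a new row


def P(word):
    """Insertion tableau of a word, built one letter at a time."""
    tableau = []
    for x in word:
        insert_row(tableau, x)
    return tableau


def longest_chain(word, ok):
    """Length of the longest subsequence whose consecutive letters satisfy ok."""
    best = [1] * len(word)
    for j in range(len(word)):
        for i in range(j):
            if ok(word[i], word[j]):
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)


w = [3, 5, 1, 6, 4, 8, 7, 2]
T = P(w)
shape = [len(row) for row in T]
reading_word = [x for row in reversed(T) for x in row]
longest_inc = longest_chain(w, lambda a, b: a <= b)
longest_dec = longest_chain(w, lambda a, b: a > b)
print(T, shape, reading_word, longest_inc, longest_dec)
```

For this word the sketch reproduces the tableau displayed above, the reading word 5 348 1267, and the values 4 and 3 for the longest increasing and decreasing subsequences.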


The next lemma provides the remaining ingredients needed to establish that P- 
equivalence and Knuth equivalence are the same. 


12.68. Lemma. Suppose $v, w \in X^*$ and $z$ is the largest letter appearing in both $v$ and $w$. Let $v'$ (resp. $w'$) be the word obtained by erasing the rightmost $z$ from $v$ (resp. $w$). If $v \equiv_K w$, then $v' \equiv_K w'$. Furthermore, if $T = P(v)$ and $T' = P(v')$, then $T'$ can be obtained from $T$ by erasing the rightmost box containing $z$.

Proof. Write $v = azb$ and $w = czd$ where $a, b, c, d \in X^*$ and $z$ does not appear in $b$ or $d$. First assume that $v$ and $w$ differ by a single elementary Knuth relation. If the triple of letters affected by this relation is part of the subword $a$, then $a \equiv_K c$ and $b = d$, so $v' = ab \equiv_K cd = w'$. Similarly, the result holds if the triple of letters is part of the subword $b$. The next possibility is that

\[ v = a'yxzb, \qquad w = a'yzxb \qquad (x < y \le z) \]

(or vice versa). Then $v' = a'yxb = w'$, so certainly $v' \equiv_K w'$. Another possibility is that

\[ v = a'xzyb', \qquad w = a'zxyb' \qquad (x \le y < z) \]

(or vice versa), and again $v' = a'xyb' = w'$. Since the $z$ under consideration is the rightmost occurrence of the largest letter in both $v$ and $w$, the possibilities already considered are the only elementary Knuth relations that involve this symbol. So the result holds when $v$ and $w$ differ by one elementary Knuth relation. Now, if $v = v^0, v^1, v^2, \ldots, v^k = w$ is a sequence of words as in 12.59, we can write each $v^i = a^i z b^i$ where $z$ does not appear in $b^i$. Letting $(v^i)' = a^i b^i$ for each $i$, the chain $v' = (v^0)', (v^1)', \ldots, (v^k)' = w'$ proves that $v' \equiv_K w'$.


Now consider the actions of the tableau insertion algorithm applied to $v = azb$ and to $v' = ab$. We prove the statement about $T$ and $T'$ by induction on the length of $b$. The statement holds if $b$ is empty. Assume $b$ has length $k > 0$ and the statement is known for smaller values of $k$. Write $b = b'x$ where $x \in X$. Then $T'_1 = P(ab')$ is the tableau $T_1 = P(azb')$ with the rightmost $z$ erased. By definition, $T' = T'_1 \leftarrow x$ and $T = T_1 \leftarrow x$. When we insert the $x$ into these two tableaux, the bumping paths will be the same (and hence the desired result holds), unless $x$ bumps the rightmost $z$ in $T_1$. If this happens, the rightmost $z$ (which must have been the only $z$ in its row) will get bumped into the next lower row. It will come to rest there without bumping anything else, and it will still be the rightmost $z$ in the tableau. Thus it is still true that erasing this $z$ in $T$ produces $T'$. The induction is therefore complete. □


12.69. Theorem: P-Equivalence vs. Knuth Equivalence. For all $v, w \in X^*$, $v \equiv_P w$ iff $v \equiv_K w$.


Proof. First, if $v \equiv_P w$, then 12.61 shows that $v \equiv_K \mathrm{rw}(P(v)) = \mathrm{rw}(P(w)) \equiv_K w$, so $v \equiv_K w$ by transitivity of $\equiv_K$. Conversely, assume $v \equiv_K w$. We prove $v \equiv_P w$ by induction on the length $k$ of $v$. For $k \le 1$, we have $v = w$ and so $v \equiv_P w$. Now assume $k > 1$ and the result is known for words of length $k - 1$. Write $v = azb$ and $w = czd$ where $z$ is the largest symbol in $v$ and $w$ and $z$ does not occur in $b$ or $d$. Write $v' = ab$ and $w' = cd$. By 12.68, $v' \equiv_K w'$, $P(v')$ is $P(v)$ with the rightmost $z$ erased, and $P(w')$ is $P(w)$ with the rightmost $z$ erased. By induction, $P(v') = P(w')$. If we knew that $P(v)$ and $P(w)$ had the same shape, it would follow that $P(v) = P(w)$. But $P(v)$ and $P(w)$ do have the same shape, thanks to 12.66. So $v \equiv_P w$. □


We conclude with an application of 12.65. 


12.70. Erdős-Szekeres Subsequence Theorem. Every word of length exceeding $mn$ either has an increasing subsequence of length $m + 1$ or a decreasing subsequence of length $n + 1$.


Proof. Suppose $w$ is a word with no increasing subsequence of length $m + 1$ and no decreasing subsequence of length $n + 1$. Let $\lambda$ be the shape of $P(w)$. Then 12.65 implies that $\lambda_1 \le m$ and $\ell(\lambda) \le n$. Therefore the length of $w$, which is $|\lambda|$, can be no greater than $\lambda_1 \ell(\lambda) \le mn$. □


12.12 Pfaffians and Perfect Matchings


Given a square matrix $A$ with $N$ rows and $N$ columns, we have defined the determinant of $A$ by the formula

\[ \det(A) = \sum_{w \in S_N} \operatorname{sgn}(w) \prod_{i=1}^{N} A(i, w(i)). \]

This section studies Pfaffians, which are numbers associated to a triangular array of numbers $(a_{i,j} : 1 \le i < j \le N)$ where $N$ is even. Pfaffians arise in the theory of skew-symmetric matrices.


12.71. Definition: Skew-Symmetric Matrices. An $N \times N$ matrix $A$ is called skew-symmetric iff $A^t = -A$ iff $A(i,j) = -A(j,i)$ for $1 \le i, j \le N$.


If $A$ is a real or complex skew-symmetric matrix, then $A(i,i) = 0$ for all $i$. Moreover, $A$ is completely determined by the triangular array of numbers $(A(i,j) : 1 \le i < j \le N)$ lying strictly above the main diagonal. The starting point for the theory of Pfaffians is the observation that, for $N$ even and $A$ skew-symmetric, $\det(A)$ is always a perfect square. (For $N$ odd, the condition $A^t = -A$ can be used to show that $\det(A) = 0$.)


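12.72. Example. A general skew-symmetric $2 \times 2$ matrix has the form $A = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}$. In this case, $\det(A) = a^2$ is a square. A skew-symmetric $4 \times 4$ matrix looks like

\[ A = \begin{pmatrix} 0 & a & b & c \\ -a & 0 & d & e \\ -b & -d & 0 & f \\ -c & -e & -f & 0 \end{pmatrix}. \]

A somewhat tedious calculation reveals that

\[ \det(A) = a^2f^2 + b^2e^2 + c^2d^2 - 2abef + 2acdf - 2bcde = (af + cd - be)^2. \]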


The remainder of this section develops the theory needed to explain the phenomenon 
observed in the last example. 


12.73. Definition: Pfaffians. Suppose $N$ is even and $A$ is a skew-symmetric $N \times N$ matrix. Let $\mathrm{SPf}_N$ be the set of all permutations $w \in S_N$ such that

\[ w_1 < w_3 < w_5 < \cdots < w_{N-1}, \qquad w_1 < w_2,\ w_3 < w_4,\ w_5 < w_6,\ \ldots,\ w_{N-1} < w_N. \]

The Pfaffian of $A$, denoted $\mathrm{Pf}(A)$, is the number

\[ \mathrm{Pf}(A) = \sum_{w \in \mathrm{SPf}_N} \operatorname{sgn}(w)\, A(w_1, w_2) A(w_3, w_4) A(w_5, w_6) \cdots A(w_{N-1}, w_N). \]


12.74. Example. If $N = 2$, $\mathrm{SPf}_2 = \{12\}$ and $\mathrm{Pf}(A) = A(1,2)$ (we write permutations in one-line form here). If $N = 4$, $\mathrm{SPf}_4 = \{1234, 1423, 1324\}$ and

\[ \mathrm{Pf}(A) = A(1,2)A(3,4) + A(1,4)A(2,3) - A(1,3)A(2,4). \]

For a general $N \times N$ matrix $A$, $\det(A)$ is a sum of $|S_N| = N!$ terms. Similarly, for a skew-symmetric matrix $A$, $\mathrm{Pf}(A)$ is a sum of $|\mathrm{SPf}_N|$ terms.


12.75. Theorem: Size of $\mathrm{SPf}_N$. For each even $N$, $|\mathrm{SPf}_N| = 1 \times 3 \times 5 \times \cdots \times (N - 1)$.


Proof. We can construct each permutation $w \in \mathrm{SPf}_N$ as follows. First, $w_1$ must be 1. There are $N - 1$ choices for $w_2$, which can be anything other than 1. To finish building $w$, choose an arbitrary permutation $v = v_1 v_2 \cdots v_{N-2} \in \mathrm{SPf}_{N-2}$. For $1 \le i \le N - 2$, set

\[ w_{i+2} = \begin{cases} v_i + 1 & \text{if } v_i < w_2 - 1 \\ v_i + 2 & \text{otherwise.} \end{cases} \]

Informally, we are renumbering the $v$'s to use symbols in $\{1, 2, \ldots, N\} \setminus \{w_1 = 1, w_2\}$ and then appending this word to $w_1 w_2$. By the product rule, $|\mathrm{SPf}_N| = (N - 1) \times |\mathrm{SPf}_{N-2}|$. Since $|\mathrm{SPf}_2| = 1$, the formula in the theorem follows by induction. □
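
For small $N$, both the set $\mathrm{SPf}_N$ and the defining sum for the Pfaffian can be examined by brute force. The Python sketch below is only an illustration under our own naming conventions (`spf`, `sgn`, and `pfaffian_by_definition` are not notation from the text); it filters all $N!$ permutations, so it is practical only for very small $N$.

```python
from itertools import permutations


def spf(n):
    """All w in S_n (one-line form, as tuples) satisfying the SPf conditions."""
    result = []
    for w in permutations(range(1, n + 1)):
        pairs_ok = all(w[i] < w[i + 1] for i in range(0, n, 2))
        firsts_ok = all(w[i] < w[i + 2] for i in range(0, n - 2, 2))
        if pairs_ok and firsts_ok:
            result.append(w)
    return result


def sgn(w):
    """Sign of a permutation in one-line form, computed from its inversions."""
    inv = sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])
    return -1 if inv % 2 else 1


def pfaffian_by_definition(A):
    """Pfaffian of a skew-symmetric list-of-lists A (the text's indices are 1-based)."""
    n = len(A)
    total = 0
    for w in spf(n):
        term = sgn(w)
        for i in range(0, n, 2):
            term *= A[w[i] - 1][w[i + 1] - 1]
        total += term
    return total


# The 4 x 4 matrix of Example 12.72 with (a, b, c, d, e, f) = (1, 2, 3, 4, 5, 6).
A = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
print([len(spf(n)) for n in (2, 4, 6, 8)])   # sizes 1, 3, 15, 105
print(pfaffian_by_definition(A))             # af + cd - be = 6 + 12 - 10 = 8
```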


Recall that the Laplace expansions in 9.48 provide recursive formulas for evaluating 
determinants. Similar recursive formulas exist for evaluating Pfaffians. The key difference 
is that two rows and columns get erased at each stage, whereas in Laplace expansions only 
one row and column get erased at a time. 


12.76. Theorem: Pfaffian Expansion along Row 1. Suppose $N$ is even and $A$ is an $N \times N$ skew-symmetric matrix. For each $i < j$, let $A[[i,j]]$ be the matrix obtained from $A$ by deleting row $i$ and row $j$ and column $i$ and column $j$; this is a skew-symmetric matrix of size $(N - 2) \times (N - 2)$. We have

\[ \mathrm{Pf}(A) = \sum_{j=2}^{N} (-1)^j A(1,j)\, \mathrm{Pf}(A[[1,j]]). \]


Proof. By definition,

\[ \mathrm{Pf}(A) = \sum_{w \in \mathrm{SPf}_N} \operatorname{sgn}(w) \prod_{\substack{i=1 \\ i\ \mathrm{odd}}}^{N} A(w_i, w_{i+1}). \]

By the proof of 12.75, there is a bijection $\mathrm{SPf}_N \to \{2, 3, \ldots, N\} \times \mathrm{SPf}_{N-2}$ that maps $w \in \mathrm{SPf}_N$ to $(j, v)$, where $j = w_2$ and $v$ is obtained from $w_3 w_4 \cdots w_N$ by renumbering the symbols to be $1, 2, \ldots, N - 2$. We will use this bijection to change the indexing set for the summation from $\mathrm{SPf}_N$ to $\{2, \ldots, N\} \times \mathrm{SPf}_{N-2}$. Counting inversions, we see that $\mathrm{inv}(w) = \mathrm{inv}(v) + j - 2$ since $w_2 = j$ exceeds $j - 2$ symbols to its right. So $\operatorname{sgn}(w) = (-1)^j \operatorname{sgn}(v)$. Next, $A(w_1, w_2) = A(1, j)$. For odd $i > 1$, it follows from the definitions that $A(w_i, w_{i+1}) = A[[1,j]](v_{i-2}, v_{i-1})$. Putting all this information into the formula, we see that

\[ \mathrm{Pf}(A) = \sum_{j=2}^{N} (-1)^j A(1,j) \sum_{v \in \mathrm{SPf}_{N-2}} \operatorname{sgn}(v) \prod_{\substack{i=1 \\ i\ \mathrm{odd}}}^{N-2} A[[1,j]](v_i, v_{i+1}). \]

The inner sum is precisely $\mathrm{Pf}(A[[1,j]])$, so the proof is complete. □


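[FIGURE 12.14: Graph used to illustrate perfect matchings.]


12.77. Example. Let us compute the Pfaffian of the matrix

\[ A = \begin{pmatrix} 0 & x & -y & 0 & 0 & 0 \\ -x & 0 & 0 & y & 0 & 0 \\ y & 0 & 0 & x & -y & 0 \\ 0 & -y & -x & 0 & 0 & y \\ 0 & 0 & y & 0 & 0 & x \\ 0 & 0 & 0 & -y & -x & 0 \end{pmatrix}. \]

Expanding along row 1 gives

\[ \mathrm{Pf}(A) = x\, \mathrm{Pf}\begin{pmatrix} 0 & x & -y & 0 \\ -x & 0 & 0 & y \\ y & 0 & 0 & x \\ 0 & -y & -x & 0 \end{pmatrix} + y\, \mathrm{Pf}\begin{pmatrix} 0 & y & 0 & 0 \\ -y & 0 & 0 & y \\ 0 & 0 & 0 & x \\ 0 & -y & -x & 0 \end{pmatrix}. \]

By expanding these $4 \times 4$ Pfaffians in the same way, or by using the formula in 12.74, we obtain

\[ \mathrm{Pf}(A) = x(x^2 + y^2) + y(xy) = x^3 + 2xy^2. \]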
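
The row-1 expansion of 12.76 translates directly into a recursive procedure. The Python sketch below is a minimal illustration, with function names of our own choosing; it works with integer entries, so the polynomial identity of this example is checked at sample values of $x$ and $y$ rather than symbolically.

```python
def minor(A, i, j):
    """Delete rows and columns i and j (0-based) from the square matrix A."""
    keep = [r for r in range(len(A)) if r not in (i, j)]
    return [[A[r][c] for c in keep] for r in keep]


def pfaffian(A):
    """Pfaffian of an even-sized skew-symmetric matrix via expansion along row 1."""
    n = len(A)
    if n == 0:
        return 1                              # empty product
    total = 0
    for j in range(1, n):                     # 0-based j is column j + 1 in the text
        if A[0][j] != 0:
            sign = 1 if j % 2 == 1 else -1    # (-1)^(j+1) for 0-based j
            total += sign * A[0][j] * pfaffian(minor(A, 0, j))
    return total


def example_matrix(x, y):
    """The 6 x 6 matrix of Example 12.77 with numbers substituted for x and y."""
    return [
        [0, x, -y, 0, 0, 0],
        [-x, 0, 0, y, 0, 0],
        [y, 0, 0, x, -y, 0],
        [0, -y, -x, 0, 0, y],
        [0, 0, y, 0, 0, x],
        [0, 0, 0, -y, -x, 0],
    ]


for x, y in [(1, 1), (2, 3), (5, -2)]:
    print(pfaffian(example_matrix(x, y)), x**3 + 2 * x * y**2)
```

Each printed pair agrees; for instance, $x = 2$ and $y = 3$ give $8 + 36 = 44$ on both sides.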


The combinatorial significance of this Pfaffian evaluation will be revealed in §12.13. 
Pfaffians are closely related to perfect matchings of graphs, which we now discuss. 


12.78. Definition: Perfect Matchings. Let $G$ be a simple graph with vertex set $V$ and edge set $E$. A perfect matching of $G$ is a subset $M$ of $E$ such that each $v \in V$ is the endpoint of exactly one edge in $M$. Let $\mathrm{PM}(G)$ be the set of perfect matchings of $G$.


12.79. Example. For the graph shown in Figure 12.14, one perfect matching is

\[ M_1 = \{\{1,6\}, \{2,10\}, \{3,9\}, \{4,8\}, \{5,7\}\}. \]

Another perfect matching is

\[ M_2 = \{\{1,2\}, \{3,4\}, \{5,7\}, \{6,9\}, \{8,10\}\}. \]


A perfect matching on a graph G is a set partition of the vertex set of G into blocks of 
size 2 where each such block is an edge of G. Therefore, if G has N vertices and a perfect 
matching exists for G, then N must be even. The next result shows that perfect matchings 
on a complete graph can be encoded by permutations in $\mathrm{SPf}_N$.


12.80. Theorem: Perfect Matchings on a Complete Graph. Suppose $N$ is even and $K_N$ is the simple graph with vertex set $\{1, 2, \ldots, N\}$ and edge set $\{\{i,j\} : 1 \le i < j \le N\}$. The map $f : \mathrm{SPf}_N \to \mathrm{PM}(K_N)$ defined by

\[ f(w_1 w_2 \cdots w_N) = \{\{w_1, w_2\}, \{w_3, w_4\}, \ldots, \{w_{N-1}, w_N\}\} \]

is a bijection. Consequently,

\[ |\mathrm{PM}(K_N)| = 1 \times 3 \times 5 \times \cdots \times (N - 1). \]


Proof. Note first that $f$ does map into the set $\mathrm{PM}(K_N)$. Next, a matching $M \in \mathrm{PM}(K_N)$ is a set of $N/2$ edges $M = \{\{i_1, i_2\}, \{i_3, i_4\}, \ldots, \{i_{N-1}, i_N\}\}$. Since $\{i,j\} = \{j,i\}$, we can choose the notation so that $i_1 < i_2$, $i_3 < i_4$, ..., and $i_{N-1} < i_N$. Similarly, since the $N/2$ edges of $M$ can be presented in any order, we can change notation again (if needed) to arrange that $i_1 < i_3 < i_5 < \cdots < i_{N-1}$. Then the permutation $w = i_1 i_2 i_3 \cdots i_N \in \mathrm{SPf}_N$ satisfies $f(w) = M$. Thus $f$ maps onto $\mathrm{PM}(K_N)$. To see that $f$ is one-to-one, suppose $v = j_1 j_2 j_3 \cdots j_N$ is another element of $\mathrm{SPf}_N$ such that $f(v) = M = f(w)$. We must have $j_1 = 1 = i_1$. Since $M$ has only one edge incident to vertex 1, and since $\{i_1, i_2\} \in M$ and $\{j_1, j_2\} \in M$ by definition of $f$, we conclude that $i_2 = j_2$. Now $i_3$ and $j_3$ must both be the smallest vertex in the set $\{1, 2, \ldots, N\} \setminus \{i_1, i_2\}$, so $i_3 = j_3$. Then $i_4 = j_4$ follows, as above, since $M$ is a perfect matching. Continuing similarly, we see that $i_k = j_k$ for all $k$, so $v = w$ and $f$ is one-to-one. Since $f$ is a bijection, the formula for $|\mathrm{PM}(K_N)|$ follows from 12.75.
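
The bijection $f$ and its inverse are easy to make concrete. In the Python sketch below (an illustration with hypothetical helper names), `decode_word` plays the role of $f$, `encode_matching` sorts the edges as in the proof to recover the word in $\mathrm{SPf}_N$, and `perfect_matchings_complete` generates $\mathrm{PM}(K_N)$ by always matching the smallest remaining vertex first.

```python
def encode_matching(matching):
    """Inverse of f: write a perfect matching as its word in SPf_N."""
    pairs = sorted(tuple(sorted(edge)) for edge in matching)
    return tuple(v for pair in pairs for v in pair)


def decode_word(w):
    """The map f: pair off consecutive letters of w."""
    return {frozenset(w[i:i + 2]) for i in range(0, len(w), 2)}


def perfect_matchings_complete(vertices):
    """Yield every perfect matching of the complete graph on the given vertices."""
    vertices = sorted(vertices)
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    for partner in rest:
        remaining = [v for v in rest if v != partner]
        for sub in perfect_matchings_complete(remaining):
            yield [(first, partner)] + sub


M1 = [{1, 6}, {2, 10}, {3, 9}, {4, 8}, {5, 7}]      # from Example 12.79
word = encode_matching(M1)
print(word)                                          # (1, 6, 2, 10, 3, 9, 4, 8, 5, 7)
print(decode_word(word) == {frozenset(e) for e in M1})
print([sum(1 for _ in perfect_matchings_complete(range(1, n + 1))) for n in (2, 4, 6, 8)])
```

The final line prints the counts 1, 3, 15, 105, in agreement with $1 \times 3 \times \cdots \times (N-1)$.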


The preceding theorem leads to the following combinatorial interpretation for Pfaffians. Given a perfect matching $M \in \mathrm{PM}(K_N)$, use 12.80 to write $M = f(w)$ for some $w \in \mathrm{SPf}_N$. Define the sign of $M$ to be $\operatorname{sgn}(w)$, and define the weight of $M$ to be

\[ \mathrm{wt}(M) = \prod_{\substack{\{i,j\} \in M \\ i < j}} x_{i,j} = \prod_{\substack{i=1 \\ i\ \mathrm{odd}}}^{N} x_{w_i, w_{i+1}}, \]

where the $x_{i,j}$ (for $1 \le i < j \le N$) are indeterminates. Let $X$ be the skew-symmetric matrix with entries $x_{i,j}$ above the main diagonal. It follows from 12.80 and the definition of a Pfaffian that

\[ \sum_{M \in \mathrm{PM}(K_N)} \operatorname{sgn}(M)\, \mathrm{wt}(M) = \mathrm{Pf}(X). \]


More generally, we have the following result. 
12.81. Theorem: Pfaffians and Perfect Matchings. Let $N$ be even, and let $G$ be a simple graph with vertex set $V = \{1, 2, \ldots, N\}$ and edge set $E(G)$. Let $X = X(G)$ be the skew-symmetric matrix with entries

\[ X(i,j) = \begin{cases} x_{i,j} & \text{if } i < j \text{ and } \{i,j\} \in E(G) \\ -x_{j,i} & \text{if } i > j \text{ and } \{i,j\} \in E(G) \\ 0 & \text{otherwise.} \end{cases} \]

Then $\sum_{M \in \mathrm{PM}(G)} \operatorname{sgn}(M)\, \mathrm{wt}(M) = \mathrm{Pf}(X(G))$.

Proof. We have already observed that

\[ \sum_{M \in \mathrm{PM}(K_N)} \operatorname{sgn}(M)\, \mathrm{wt}(M) = \mathrm{Pf}(X(K_N)). \tag{12.11} \]

Given the graph $G$, let $\epsilon$ be the evaluation homomorphism (see 7.102) that sends $x_{i,j}$ to $x_{i,j}$ if $\{i,j\} \in E(G)$, and sends $x_{i,j}$ to 0 if $\{i,j\} \notin E(G)$. Applying $\epsilon$ to the left side of (12.11) produces

\[ \sum_{M \in \mathrm{PM}(G)} \operatorname{sgn}(M)\, \mathrm{wt}(M), \]

since all matchings of $K_N$ that use an edge not in $E(G)$ are mapped to zero. On the other hand, since $\epsilon$ is a ring homomorphism and the Pfaffian of a matrix is a polynomial in the entries of the matrix, we can compute $\epsilon(\mathrm{Pf}(X(K_N)))$ by applying $\epsilon$ to each entry of $X(K_N)$ and taking the Pfaffian of the resulting matrix. So, applying $\epsilon$ to the right side of (12.11) gives

\[ \epsilon(\mathrm{Pf}(X(K_N))) = \mathrm{Pf}(\epsilon(X(K_N))) = \mathrm{Pf}(X(G)). \] □


12.82. Remark. The last result shows that $\mathrm{Pf}(X(G))$ is a signed sum of distinct monomials, where there is one monomial for each perfect matching of $G$. Because of the signs, one cannot compute $|\mathrm{PM}(G)|$ by setting $x_{i,j} = 1$ for each $\{i,j\} \in E(G)$. However, for certain graphs $G$, one can introduce extra signs into the upper part of the matrix $X(G)$ to counteract the sign arising from $\operatorname{sgn}(M)$. This process is illustrated in the next section.
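
Both 12.81 and the caveat in this remark can be observed on small graphs. The Python sketch below is illustrative only; the helper names and the numerical edge weights (standing in for the indeterminates $x_{i,j}$) are our own choices. It compares the signed sum over perfect matchings with the Pfaffian of $X(G)$, and then sets every weight to 1.

```python
def pfaffian(A):
    """Pfaffian via expansion along row 1 (see 12.76)."""
    n = len(A)
    if n == 0:
        return 1
    total = 0
    for j in range(1, n):
        if A[0][j] != 0:
            keep = [r for r in range(n) if r not in (0, j)]
            sub = [[A[r][c] for c in keep] for r in keep]
            total += (1 if j % 2 == 1 else -1) * A[0][j] * pfaffian(sub)
    return total


def sgn(w):
    inv = sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])
    return -1 if inv % 2 else 1


def matching_words(n, weights):
    """Words in SPf_n encoding the perfect matchings of the graph with the given edges."""
    def extend(unused):
        if not unused:
            yield ()
            return
        first = unused[0]
        for partner in unused[1:]:
            if (first, partner) in weights:
                rest = tuple(v for v in unused[1:] if v != partner)
                for tail in extend(rest):
                    yield (first, partner) + tail
    return list(extend(tuple(range(1, n + 1))))


def signed_matching_sum(n, weights):
    total = 0
    for w in matching_words(n, weights):
        term = sgn(w)
        for i in range(0, n, 2):
            term *= weights[(w[i], w[i + 1])]
        total += term
    return total


def x_matrix(n, weights):
    """X(G), with the number weights[(i, j)] substituted for x_{i,j} (i < j)."""
    X = [[0] * n for _ in range(n)]
    for (i, j), value in weights.items():
        X[i - 1][j - 1] = value
        X[j - 1][i - 1] = -value
    return X


k4 = {(1, 2): 2, (1, 3): 3, (1, 4): 5, (2, 3): 7, (2, 4): 11, (3, 4): 13}
print(signed_matching_sum(4, k4), pfaffian(x_matrix(4, k4)))     # both 28

k4_ones = {edge: 1 for edge in k4}
print(len(matching_words(4, k4_ones)), pfaffian(x_matrix(4, k4_ones)))   # 3 matchings, Pfaffian 1
```

For $K_4$ the three matchings have signs $+, -, +$, so setting all weights to 1 yields 1 rather than the number of matchings, which is 3.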


We can now give a combinatorial proof of the main result linking Pfaffians and deter- 
minants. 


12.83. Theorem: Pfaffians vs. Determinants. For every even $N$ and every $N \times N$ skew-symmetric matrix $A$, $\det(A) = \mathrm{Pf}(A)^2$.


Proof. First we use the skew-symmetry of $A$ to cancel some terms in the sum

\[ \det(A) = \sum_{w \in S_N} \operatorname{sgn}(w) \prod_{i=1}^{N} A(i, w(i)). \]

We will cancel every term indexed by a permutation $w$ whose functional digraph contains at least one cycle of odd length (cf. §3.6). If $w$ has a cycle of length 1, then $w(i) = i$ for some $i$. So $A(i, w(i)) = A(i,i) = 0$ by skew-symmetry, and the term indexed by this $w$ is zero. On the other hand, suppose $w$ has no fixed points, but $w$ does have at least one cycle of odd length. Among all the odd-length cycles of $w$, choose the cycle $(i_1, i_2, \ldots, i_k)$ whose minimum element is as small as possible. Reverse the orientation of this cycle to get a permutation $w' \ne w$. For example, if $w = (3,8,4)(2,5,7)(1,6)(9,10)$, then $w' = (3,8,4)(7,5,2)(1,6)(9,10)$. In general, $\operatorname{sgn}(w') = \operatorname{sgn}(w)$ since $w$ and $w'$ have the same cycle structure (see 9.34). However, since $k$ is odd and $A$ is skew-symmetric,

\[ A(i_1, i_2) A(i_2, i_3) \cdots A(i_{k-1}, i_k) A(i_k, i_1) = -A(i_2, i_1) A(i_3, i_2) \cdots A(i_k, i_{k-1}) A(i_1, i_k). \]

It follows that the term in $\det(A)$ indexed by $w'$ is the negative of the term in $\det(A)$ indexed by $w$, so this pair of terms cancels. Since $w \mapsto w'$ is an involution, we conclude that

\[ \det(A) = \sum_{w \in S_N^{\mathrm{ev}}} \operatorname{sgn}(w) \prod_{i=1}^{N} A(i, w(i)), \]

where $S_N^{\mathrm{ev}}$ denotes the set of permutations of $N$ objects with only even-length cycles.
The next step is to compare the terms in this sum to the terms in $\mathrm{Pf}(A)^2$. Using the distributive law to square the defining formula for $\mathrm{Pf}(A)$, we see that

\[ \mathrm{Pf}(A)^2 = \sum_{u \in \mathrm{SPf}_N} \sum_{v \in \mathrm{SPf}_N} \operatorname{sgn}(u) \operatorname{sgn}(v) \prod_{i\ \mathrm{odd}} [A(u_i, u_{i+1}) A(v_i, v_{i+1})]. \]

Given $w \in S_N^{\mathrm{ev}}$ indexing an uncanceled term in $\det(A)$, we associate a pair $(u, v) \in \mathrm{SPf}_N^2$ indexing a summand in $\mathrm{Pf}(A)^2$ as follows. Consider the functional digraph $G(w)$ with vertex set $\{1, 2, \ldots, N\}$ and edge set $\{(i, w(i)) : 1 \le i \le N\}$, which is a disjoint union of cycles. Define a perfect matching $M_1$ on $G(w)$ (viewed as an undirected graph) by starting at the minimum element in each cycle and including every other edge as one travels around the cycle. Define another perfect matching $M_2$ on $G(w)$ by taking all the edges not used in $M_1$. Finally, let $u$ and $v$ be the permutations in $\mathrm{SPf}_N$ that encode $M_1$ and $M_2$ via the bijection in 12.80. For example, if $w = (1,5,2,8,6,3)(4,7)$, then $M_1 = \{\{1,5\}, \{2,8\}, \{6,3\}, \{4,7\}\}$ and $M_2 = \{\{5,2\}, \{8,6\}, \{3,1\}, \{7,4\}\}$, so $u = 15283647$ and $v = 13254768$. The association $w \mapsto (u, v)$ is a bijection from $S_N^{\mathrm{ev}}$ to $\mathrm{SPf}_N^2$. To compute the inverse map, one need only take the union of the perfect matchings encoded by $u$ and $v$. This produces a graph that is a disjoint union of cycles of even length, as is readily checked. One can restore the directions on each cycle by recalling that the outgoing edge from the minimum element in each cycle belongs to the matching encoded by $u$. For example, the pair $(u', v') = (15234867, 12374856)$ maps to $w' = (1,5,6,7,3,2)(4,8)$ under the inverse bijection.

Throughout the following discussion, assume $w \in S_N^{\mathrm{ev}}$ corresponds to $(u, v) \in \mathrm{SPf}_N^2$. To complete the proof, it suffices to show that the term in $\det(A)$ indexed by $w$ equals the term in $\mathrm{Pf}(A)^2$ indexed by $(u, v)$. Write $w$ in cycle form as

\[ w = (m_1, n_1, \ldots, z_1)(m_2, n_2, \ldots, z_2) \cdots (m_k, n_k, \ldots, z_k) \]

where $m_1 < m_2 < \cdots < m_k$ are the minimum elements in their cycles. Define two words (permutations in one-line form)

\[ u^* = m_1 n_1 \cdots z_1\, m_2 n_2 \cdots z_2\, \cdots\, m_k n_k \cdots z_k; \]
\[ v^* = n_1 \cdots z_1 m_1\, n_2 \cdots z_2 m_2\, \cdots\, n_k \cdots z_k m_k. \]

Thus $u^*$ is obtained by erasing the parentheses in the particular cycle notation for $w$ just mentioned, and $v^*$ is obtained similarly after first cycling the values in each cycle one step to the left. Since each $m_i$ is the smallest value in its cycle, it follows that

\[ \mathrm{inv}(v^*) = N - k + \mathrm{inv}(u^*) \qquad (k = \mathrm{cyc}(w)). \]

Therefore $\operatorname{sgn}(u^*) \operatorname{sgn}(v^*) = (-1)^{N - \mathrm{cyc}(w)} = \operatorname{sgn}(w)$. Since all the edges $(i, w(i))$ in $G(w)$ arise by pairing off consecutive letters in $u^*$ and $v^*$, we have

\[ \operatorname{sgn}(w) \prod_{i=1}^{N} A(i, w(i)) = \operatorname{sgn}(u^*) \operatorname{sgn}(v^*) \prod_{i\ \mathrm{odd}} [A(u^*_i, u^*_{i+1}) A(v^*_i, v^*_{i+1})]. \]


We now transform the right side to the term indexed by $(u, v)$ in $\mathrm{Pf}(A)^2$, as follows. Note that the words $u^*$ and $v^*$ provide non-standard encodings of the perfect matchings $M_1$ and $M_2$ encoded by $u$ and $v$ (the edges of the matchings are found by grouping pairs of consecutive symbols in $u^*$ and $v^*$). To get to the standard encodings, first reverse each pair of consecutive letters $u^*_i, u^*_{i+1}$ in $u^*$ such that $u^*_i > u^*_{i+1}$ and $i$ is odd. Each such reversal causes $\operatorname{sgn}(u^*)$ to change, but this change is balanced by the fact that $A(u^*_{i+1}, u^*_i) = -A(u^*_i, u^*_{i+1})$. Similarly, we can reverse pairs of consecutive letters in $v^*$ that are out of order. The next step is to sort the pairs in $u^*$ to force $u_1 < u_3 < u_5 < \cdots < u_{N-1}$. This sorting can be achieved by repeatedly swapping adjacent pairs $a < b$; $c < d$ in the word, where $a > c$. The swap $abcd \mapsto cdab$ can be achieved by applying the two transpositions $(a, c)$ and $(b, d)$ on the left. So this modification of $u^*$ does not change $\operatorname{sgn}(u^*)$, nor does it affect the product of the factors $A(u^*_i, u^*_{i+1})$ (since multiplication is commutative). Similarly, we can sort the pairs in $v^*$ to obtain $v$ without changing the formula. We conclude finally that

\[ \operatorname{sgn}(w) \prod_{i=1}^{N} A(i, w(i)) = \operatorname{sgn}(u^*) \operatorname{sgn}(v^*) \prod_{i\ \mathrm{odd}} [A(u^*_i, u^*_{i+1}) A(v^*_i, v^*_{i+1})] = \operatorname{sgn}(u) \operatorname{sgn}(v) \prod_{i\ \mathrm{odd}} [A(u_i, u_{i+1}) A(v_i, v_{i+1})]. \] □

The following example illustrates the calculations at the end of the preceding proof.


12.84. Example. Suppose $w = (3,8)(11,4,2,9)(1,10,6,7,5,12) \in S_{12}^{\mathrm{ev}}$, so $k = \mathrm{cyc}(w) = 3$. We begin by writing the standard cycle notation for $w$:

\[ w = (1,10,6,7,5,12)(2,9,11,4)(3,8). \]

Next we set

    u* = 1,10; 6,7; 5,12; 2,9; 11,4; 3,8;      v* = 10,6; 7,5; 12,1; 9,11; 4,2; 8,3.

Observe that $\mathrm{inv}(v^*) = \mathrm{inv}(u^*) + (12 - 3)$ due to the cyclic shifting of 1, 2, 3, so that $\operatorname{sgn}(u^*) \operatorname{sgn}(v^*) = (-1)^{12-3} = \operatorname{sgn}(w)$. Now we modify $u^*$ and $v^*$ so that the elements in each pair increase:

    u' = 1,10; 6,7; 5,12; 2,9; 4,11; 3,8;      v' = 6,10; 5,7; 1,12; 9,11; 2,4; 3,8.

Note that $\operatorname{sgn}(u') = -\operatorname{sgn}(u^*)$ since we switched 11 and 4, but this is offset by the fact that $A(11,4) = -A(4,11)$. So $\operatorname{sgn}(u^*) \prod_i A(u^*_i, u^*_{i+1}) = \operatorname{sgn}(u') \prod_i A(u'_i, u'_{i+1})$, and similarly for $v^*$ and $v'$. Finally, we sort the pairs so that the minimum elements increase, obtaining

    u = 1,10; 2,9; 3,8; 4,11; 5,12; 6,7;      v = 1,12; 2,4; 3,8; 5,7; 6,10; 9,11.

This sorting does not introduce any further sign changes, so we have successfully transformed the term indexed by $w$ in $\det(A)$ to the term indexed by $(u, v)$ in $\mathrm{Pf}(A)^2$.
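
The passage from $w$ to the pair $(u, v)$ can be carried out mechanically. The Python sketch below is an illustration with hypothetical helper names; a permutation is stored as a dictionary $i \mapsto w(i)$, and each cycle is split into alternate edges starting from its minimum element, exactly as in the proof of 12.83.

```python
def from_cycles(*cycles):
    """Build the permutation i -> w(i) from a list of disjoint cycles."""
    return {c[t]: c[(t + 1) % len(c)] for c in cycles for t in range(len(c))}


def cycles_min_first(w):
    """Cycles of w, each written starting at its minimum, ordered by those minima."""
    seen, cycles = set(), []
    for start in sorted(w):
        if start not in seen:
            cycle, i = [], start
            while i not in seen:
                seen.add(i)
                cycle.append(i)
                i = w[i]
            cycles.append(cycle)
    return cycles


def standard_word(pairs):
    """Standard encoding (12.80) of a matching given as a list of pairs."""
    pairs = sorted(tuple(sorted(p)) for p in pairs)
    return tuple(x for p in pairs for x in p)


def split_into_matchings(w):
    """Return (u, v) for a permutation w all of whose cycles have even length."""
    m1, m2 = [], []
    for cycle in cycles_min_first(w):
        if len(cycle) % 2:
            raise ValueError("w has a cycle of odd length")
        for t in range(len(cycle)):
            edge = (cycle[t], cycle[(t + 1) % len(cycle)])
            (m1 if t % 2 == 0 else m2).append(edge)   # alternate edges around the cycle
    return standard_word(m1), standard_word(m2)


w = from_cycles((3, 8), (11, 4, 2, 9), (1, 10, 6, 7, 5, 12))
u, v = split_into_matchings(w)
print(u)   # (1, 10, 2, 9, 3, 8, 4, 11, 5, 12, 6, 7)
print(v)   # (1, 12, 2, 4, 3, 8, 5, 7, 6, 10, 9, 11)
```

The printed words agree with the $u$ and $v$ obtained at the end of this example.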




12.13 Domino Tilings of Rectangles 


This section presents P. W. Kasteleyn's proof of a formula for the number of ways to tile a rectangle with dominos. Let $\mathrm{Dom}(m,n)$ be the set of domino tilings of a rectangle of width $m$ and height $n$. This set is empty if $m$ and $n$ are both odd, so we will assume throughout that $m$ is even. Given a tiling $T \in \mathrm{Dom}(m,n)$, let $N_h(T)$ and $N_v(T)$ be the number of horizontal and vertical dominos (respectively) appearing in $T$. Define the weight of the tiling $T$ to be $\mathrm{wt}(T) = x^{N_h(T)} y^{N_v(T)}$.


12.85. Theorem: Domino Tiling Formula. For all even $m \ge 1$ and all $n \ge 1$,

\[ \sum_{T \in \mathrm{Dom}(m,n)} \mathrm{wt}(T) = 2^{mn/2} \prod_{j=1}^{m/2} \prod_{k=1}^{n} \sqrt{x^2 \cos^2\left(\frac{j\pi}{m+1}\right) + y^2 \cos^2\left(\frac{k\pi}{n+1}\right)}. \tag{12.12} \]

By setting $x = y = 1$, we obtain the expression for $|\mathrm{Dom}(m,n)|$ stated in the Introduction.
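
Formula (12.12) is easy to test numerically on small boards. The Python sketch below is illustrative, with function names of our own; it evaluates the right side of (12.12) in floating point and compares it with a brute-force count of tilings, which always covers the first uncovered square (in row-major order) by a domino extending to the right or upward.

```python
from math import cos, pi, sqrt


def tiling_weight_sum(m, n, x=1.0, y=1.0):
    """Right side of (12.12) in floating point; m must be even."""
    product = 2.0 ** (m * n / 2)
    for j in range(1, m // 2 + 1):
        for k in range(1, n + 1):
            product *= sqrt(x * x * cos(j * pi / (m + 1)) ** 2
                            + y * y * cos(k * pi / (n + 1)) ** 2)
    return product


def count_tilings(m, n):
    """Brute-force number of domino tilings of a board of width m and height n."""
    def fill(covered):
        if len(covered) == m * n:
            return 1
        cell = next(c for c in range(m * n) if c not in covered)
        row, col = divmod(cell, m)
        total = 0
        if col + 1 < m and cell + 1 not in covered:      # horizontal domino
            total += fill(covered | {cell, cell + 1})
        if row + 1 < n and cell + m not in covered:      # vertical domino
            total += fill(covered | {cell, cell + m})
        return total
    return fill(frozenset())


for m, n in [(2, 3), (4, 3), (4, 4)]:
    print((m, n), count_tilings(m, n), round(tiling_weight_sum(m, n), 6))
```

For the boards listed, the brute-force counts are 3, 11, and 36, and the formula returns the same values up to rounding error.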


Step 1: Conversion to a Perfect Matching Problem. Introduce a simple graph $G(m,n)$ with vertex set $V = \{1, 2, \ldots, mn\}$ and edge set $E = E_x \cup E_y$, where

\[ E_x = \{\{k, k+1\} : k \not\equiv 0 \pmod{m}\}, \qquad E_y = \{\{k, k+m\} : 1 \le k \le m(n-1)\}. \]

This graph models an $m \times n$ rectangle $R$, as follows. The unit square in the $i$th row from the bottom and the $j$th column from the left in $R$ corresponds to the vertex $(i-1)m + j$, for $1 \le i \le n$ and $1 \le j \le m$. There is an edge in $E_x$ for each pair of two horizontally adjacent squares in $R$, and there is an edge in $E_y$ for each pair of two vertically adjacent squares in $R$. There is a bijection between the set $\mathrm{Dom}(m,n)$ of domino tilings of $R$ and the set $\mathrm{PM}(G(m,n))$ of perfect matchings of $G(m,n)$. Given a domino tiling, one need only replace each domino covering two adjacent squares by the edge associated to these two squares. This does give a perfect matching, since each square is covered by exactly one domino. If a tiling $T$ corresponds to a matching $M$ under this bijection, we have $N_h(T) = |M \cap E_x|$ and $N_v(T) = |M \cap E_y|$. So, defining $\mathrm{wt}(M) = x^{|M \cap E_x|} y^{|M \cap E_y|}$, we have

\[ \sum_{T \in \mathrm{Dom}(m,n)} \mathrm{wt}(T) = \sum_{M \in \mathrm{PM}(G(m,n))} \mathrm{wt}(M). \]

[FIGURE 12.15: Graph used to model domino tilings. The squares of the rectangle R are numbered 1 to 20, row by row from the bottom, and correspond to the vertices of G(4,5).]

[FIGURE 12.16: A domino tiling and a perfect matching (of G(4,5)).]


12.86. Example. Figure 12.15 shows the rectangle R and associated graph G(m,n) when 
m = 4 and n = 5. Figure 12.16 shows a domino tiling of R and the associated perfect 
matching. The tiling and matching shown both have weight x*y°. 
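
The graph $G(m,n)$ and the weights of its perfect matchings can be generated directly from the definitions in Step 1. The Python sketch below is an illustration only (the function names are ours); it records, for each pair (number of horizontal edges, number of vertical edges), how many perfect matchings have that pair, which amounts to listing the coefficients of $\sum_M \mathrm{wt}(M)$.

```python
from collections import Counter


def grid_edges(m, n):
    """Edge sets E_x and E_y of G(m, n), with vertices numbered 1..mn."""
    ex = [(k, k + 1) for k in range(1, m * n + 1) if k % m != 0]
    ey = [(k, k + m) for k in range(1, m * (n - 1) + 1)]
    return ex, ey


def matching_weights(m, n):
    """Counter: (horizontal edges, vertical edges) -> number of perfect matchings."""
    ex, ey = grid_edges(m, n)
    horizontal, vertical = set(ex), set(ey)
    tally = Counter()

    def extend(unused, h, v):
        if not unused:
            tally[(h, v)] += 1
            return
        k = min(unused)            # its smaller neighbors are already matched
        if (k, k + 1) in horizontal and k + 1 in unused:
            extend(unused - {k, k + 1}, h + 1, v)
        if (k, k + m) in vertical and k + m in unused:
            extend(unused - {k, k + m}, h, v + 1)

    extend(frozenset(range(1, m * n + 1)), 0, 0)
    return tally


print(dict(matching_weights(2, 3)))            # {(3, 0): 1, (1, 2): 2}
print(sum(matching_weights(4, 3).values()))    # 11
```

For $m = 2$ and $n = 3$ the tally corresponds to the polynomial $x^3 + 2xy^2$, which is the Pfaffian computed in Example 12.77.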


Step 2: Enumeration via Pfaffians. Let $X_1$ be the skew-symmetric matrix defined in 12.81, taking $G$ there to be $G(m,n)$. We know that

\[ \sum_{M \in \mathrm{PM}(G(m,n))} \operatorname{sgn}(M) \prod_{\{i<j\} \in M} x_{i,j} = \mathrm{Pf}(X_1). \tag{12.13} \]

      0   x   0   0  -y   0   0   0   0   0   0   0
     -x   0   x   0   0   y   0   0   0   0   0   0
      0  -x   0   x   0   0  -y   0   0   0   0   0
      0   0  -x   0   0   0   0   y   0   0   0   0
      y   0   0   0   0   x   0   0  -y   0   0   0
      0  -y   0   0  -x   0   x   0   0   y   0   0
      0   0   y   0   0  -x   0   x   0   0  -y   0
      0   0   0  -y   0   0  -x   0   0   0   0   y
      0   0   0   0   y   0   0   0   0   x   0   0
      0   0   0   0   0  -y   0   0  -x   0   x   0
      0   0   0   0   0   0   y   0   0  -x   0   x
      0   0   0   0   0   0   0  -y   0   0  -x   0

FIGURE 12.17
Matrix used to enumerate domino tilings (m = 4, n = 3).

We introduce the terms horizontal edge, odd vertical edge, and even vertical edge to refer (respectively) to edges in $E_x$, edges $\{k, k+m\}$ in $E_y$ with $k$ odd, and edges $\{k, k+m\}$ in $E_y$ with $k$ even. Consider the evaluation homomorphism (see 7.102) that sends $x_{i,j}$ to $x$ if $\{i,j\}$ is a horizontal edge, sends $x_{i,j}$ to $y$ if $\{i,j\}$ is an even vertical edge, and sends $x_{i,j}$ to $-y$ if $\{i,j\}$ is an odd vertical edge. Let $X$ be the matrix obtained by applying this homomorphism to each entry of the matrix $X_1$. Explicitly, $X$ is the $mn \times mn$ matrix with entries

\[ X(i,j) = \begin{cases}
x & \text{if } j = i+1 \text{ and } i \not\equiv 0 \pmod{m} \\
y & \text{if } j = i+m \text{ and } i \equiv 0 \pmod{2} \\
-y & \text{if } j = i+m \text{ and } i \equiv 1 \pmod{2} \\
-x & \text{if } i = j+1 \text{ and } j \not\equiv 0 \pmod{m} \\
-y & \text{if } i = j+m \text{ and } j \equiv 0 \pmod{2} \\
y & \text{if } i = j+m \text{ and } j \equiv 1 \pmod{2} \\
0 & \text{otherwise.}
\end{cases} \tag{12.14} \]

For example, the matrix $X$ when $m = 4$ and $n = 3$ appears in Figure 12.17. Let $\operatorname{sgn}^*(M) = \operatorname{sgn}(M)(-1)^t$, where $t$ is the number of odd vertical edges in $M$. Applying the evaluation homomorphism to each side of (12.13) gives

\[ \sum_{M \in \mathrm{PM}(G(m,n))} \operatorname{sgn}^*(M)\, \mathrm{wt}(M) = \mathrm{Pf}(X). \]


Step 3: Sign Analysis. The crucial fact to be verified is that sgn*(M) = +1 for every M. 
Before proving this fact, we consider an example. 


12.87. Example. Consider the following domino tiling of a $16 \times 4$ rectangle:

[Figure: a domino tiling of the 16 × 4 rectangle, with rows labeled 1 to 4 from the bottom and columns labeled 1 to 16 from the left.]

This tiling corresponds to a perfect matching $M$ of $G(16,4)$, which is encoded (via 12.80) by a word $w \in \mathrm{SPf}_{64}$. By definition, $\operatorname{sgn}(M) = (-1)^{\mathrm{inv}(w)}$. In our example, the word of $M$ is

    w = 1,2; 3,4; 5,21; 6,7; 8,24; 9,25; 10,11; 12,13; 14,15; 16,32;
        17,33; 18,19; 20,36; 22,38; 23,39; 26,27; 28,44; 29,30; 31,47; ...; 60,61; 62,63.


Note that w consists of pairs of letters indicating the two squares occupied by each domino in 
the tiling. We imagine placing dominos on the board one at a time, in the order specified by 
w, and updating sgn(M) and sgn*(M) as we go along. When computing inv(w), the second
symbol in each pair sometimes causes inversions with symbols following it in w. Pairs 
corresponding to horizontal dominos never cause any inversions. Consider the inversions 
caused by a vertical domino (i.e., a vertical edge in M). The first vertical edge appearing in 
w is {5,21}. The 21 is greater than the fifteen symbols 6,7,...,20 corresponding to squares 
to the right of column 5 in row 1 and squares to the left of column 5 in row 2, which have 
not been covered by a domino yet. So this edge increases inv(w) by 15 = m — 1, which 
causes a sign change in sgn(M). However, since this edge is an odd vertical edge, that sign 
change is counteracted in sgn*(M). 

The next vertical edge in w is {8,24}. The symbol 24 causes 14 = m — 2 new inversions, 
corresponding to squares to the right of column 8 in row 1 and squares to the left of column 
8 in row 2, excluding column 5. These inversions do not change sgn(M), and sgn*(M) is
also unchanged since {8, 24} is an even vertical edge. 

Continuing similarly, we eventually come to the odd vertical edge {23, 39} in w. Recalling 
the order of domino placement, we see that the 39 causes inversions with the following nine 
symbols to its right in w: 37, 35, 34, 31, 30, 29, 28, 27, 26. Since nine is odd, we get a sign 
change in sgn(M), but this is counteracted in sgn*(M) since we have just added an odd 
vertical edge. After accounting for all the dominos, we find that indeed sgn*(M) = +1, 
since the insertion of each vertical domino never leads to a net sign change (see 12.170). 


Now we are ready to prove that $\operatorname{sgn}^*(M) = +1$ for a general $M \in \mathrm{PM}(G(m,n))$. Let $w \in \mathrm{SPf}_{mn}$ be the word encoding $M$. As in the example, we calculate $\operatorname{sgn}^*(M) = (-1)^{\mathrm{inv}(w)} (-1)^t$ incrementally by scanning the edges in $w$ from left to right. Initially, before scanning any edges, this quantity is $+1$. Suppose the next edge in the scan is the horizontal edge $\{k, k+1\}$. By definition of $w$ (see 12.80), $k$ is the smallest symbol that has not appeared previously in $w$. So $k$ and $k+1$ cannot cause any new inversions with symbols following them. Similarly, $t$ (the number of odd vertical edges) does not increase when we scan this edge. So $\operatorname{sgn}^*(M)$ is still $+1$ after scanning this edge.

Before continuing, we need the following observation: for every row $i \ge 1$, the number of vertical dominos that start in row $i$ and end in row $i + 1$ is even (possibly zero). This is proved by induction on $i$. To prove the case $i = 1$, suppose there are $a$ horizontal dominos in row 1. Then there must be $m - 2a$ vertical dominos starting in row 1. This number is even, since $m$ is even. Now assume the result holds in row $i - 1$. In row $i$, suppose there are $a$ horizontal dominos, $b$ vertical dominos coming up from row $i - 1$, and $c$ vertical dominos leading up into row $i + 1$. Then $c = m - 2a - b$. Since $m$ is even and (by hypothesis) $b$ is even, $c$ must also be even.

Now suppose the next edge in the scan is a vertical edge $\{k, k+m\}$ in column $j$ that covers rows $i$ and $i+1$ (so $k = (i-1)m + j$). As before, the symbol $k$ causes no new inversions. Let us count the inversions in $w$ between $k + m$ and symbols to its right. There are $m - 1$ symbols that might cause inversions with $k + m$, namely $k+1, k+2, \ldots, k+(m-1)$, but some of these symbols may have already appeared in $w$. Specifically, if there are $a$ vertical dominos covering rows $i$ and $i+1$ to the left of column $j$, and $b$ vertical dominos covering rows $i-1$ and $i$ to the right of column $j$, then $a + b$ of the symbols just mentioned will have already appeared in $w$. So, the inclusion of the new edge increases $\mathrm{inv}(w)$ by $(m-1) - (a+b)$. Now, let there be $b'$ vertical dominos covering rows $i-1$ and $i$ to the left of column $j$, and $c$ horizontal dominos in row $i$ to the left of column $j$. Since $m - 1 \equiv 1 \pmod 2$, $-a \equiv a \pmod 2$, $-b \equiv b' \pmod 2$ (by the observation in the last paragraph), and $2c \equiv 0 \pmod 2$, we see that

\[ (m - 1) - a - b \equiv 1 + a + b' + 2c \pmod 2. \]

But $1 + a + b' + 2c = j$ since $a + b' + 2c$ counts all the columns left of column $j$ in row $i$. We conclude, finally, that the increase in $\mathrm{inv}(w)$ caused by the insertion of the edge $\{k, k+m\}$ has the same parity as the column index $j$. Since $m$ is even, $j$ and $k$ have the same parity, so the number of new inversions is odd iff the new vertical edge is an odd vertical edge. So there is no net change in $\operatorname{sgn}^*(M) = (-1)^{\mathrm{inv}(w)} (-1)^t$ when we add this edge. This completes the proof that $\operatorname{sgn}^*(M) = +1$.
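
The effect of the sign analysis can be seen numerically: with the signs of (12.14) in place, the Pfaffian of $X$ counts tilings, whereas without them it generally does not. The Python sketch below is an illustration under our own naming; the `twist` flag is a hypothetical switch that, when turned off, puts $+y$ on every vertical edge (that is, it evaluates the unmodified matrix $X_1$).

```python
def pfaffian(A):
    """Pfaffian via expansion along row 1 (see 12.76), skipping zero entries."""
    n = len(A)
    if n == 0:
        return 1
    total = 0
    for j in range(1, n):
        if A[0][j] != 0:
            keep = [r for r in range(n) if r not in (0, j)]
            sub = [[A[r][c] for c in keep] for r in keep]
            total += (1 if j % 2 == 1 else -1) * A[0][j] * pfaffian(sub)
    return total


def kasteleyn_matrix(m, n, x, y, twist=True):
    """The matrix X of (12.14); twist=False keeps +y on every vertical edge."""
    size = m * n
    X = [[0] * size for _ in range(size)]
    for i in range(1, size + 1):
        if i % m != 0:                          # horizontal edge {i, i+1}
            X[i - 1][i] = x
            X[i][i - 1] = -x
        if i + m <= size:                       # vertical edge {i, i+m}
            value = -y if (twist and i % 2 == 1) else y
            X[i - 1][i + m - 1] = value
            X[i + m - 1][i - 1] = -value
    return X


print(pfaffian(kasteleyn_matrix(4, 3, 1, 1)))                 # 11 tilings of the 4 x 3 board
print(pfaffian(kasteleyn_matrix(2, 3, 2, 3)))                 # x^3 + 2xy^2 at x=2, y=3, i.e. 44
print(pfaffian(kasteleyn_matrix(2, 3, 1, 1, twist=False)))    # -1, not the tiling count 3
```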


Step 4: Evaluation of the Pfaffian. Combining steps 1 through 3 and 12.83, we have

\[ \sum_{T \in \mathrm{Dom}(m,n)} \mathrm{wt}(T) = \sum_{M \in \mathrm{PM}(G(m,n))} \mathrm{wt}(M) = \mathrm{Pf}(X) = \sqrt{\det(X)}, \]

where $X$ is the $mn \times mn$ matrix defined by (12.14). So we are reduced to evaluating the determinant of $X$. The idea is to replace $X$ by a similar matrix $U^{-1}XU$ whose determinant is easier to evaluate. For this purpose, it is convenient to introduce tensor products of matrices.


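12.88. Definition: Tensor Product of Matrices. If $A$ is any $n \times n$ matrix and $B$ is any $m \times m$ matrix, let $A \otimes B$ be the $mn \times mn$ matrix given in block form by

\[ A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{n1}B & a_{n2}B & \cdots & a_{nn}B \end{pmatrix}. \]

Formally, $(A \otimes B)(m(i_1 - 1) + i_2,\ m(j_1 - 1) + j_2) = A(i_1, j_1) B(i_2, j_2)$ for all $1 \le i_1, j_1 \le n$ and all $1 \le i_2, j_2 \le m$.


The following properties of tensor products may be routinely verified:

a) $(A_1 + A_2) \otimes B = (A_1 \otimes B) + (A_2 \otimes B)$ and $A \otimes (B_1 + B_2) = (A \otimes B_1) + (A \otimes B_2)$.
b) For any scalar $c$, $(cA) \otimes B = c(A \otimes B) = A \otimes (cB)$.
c) $(A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1 A_2) \otimes (B_1 B_2)$.
d) If $A$ and $B$ are invertible, then $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.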
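
The formal definition of $A \otimes B$ and property (c) can be checked on small integer matrices. The Python sketch below is an illustration with helper names of our own (`kron`, `matmul`); matrices are stored as lists of rows with 0-based indices, so row $m(i_1 - 1) + i_2$ of the text becomes row $m \cdot i_1 + i_2$.

```python
def kron(A, B):
    """Tensor (Kronecker) product of square matrices given as lists of rows."""
    n, m = len(A), len(B)
    return [[A[i1][j1] * B[i2][j2] for j1 in range(n) for j2 in range(m)]
            for i1 in range(n) for i2 in range(m)]


def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


A1, A2 = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
B1 = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
B2 = [[1, 1, 0], [0, 2, 1], [1, 0, 1]]

print(kron([[1, 2], [3, 4]], [[0, 1], [1, 0]]))
# Property (c): (A1 ⊗ B1)(A2 ⊗ B2) = (A1 A2) ⊗ (B1 B2)
print(matmul(kron(A1, B1), kron(A2, B2)) == kron(matmul(A1, A2), matmul(B1, B2)))
```

The first line prints the block matrix whose blocks are 1, 2, 3, 4 times the $2 \times 2$ swap matrix, and the second prints `True`.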

For every $k \ge 1$, let $I_k$ denote the k x k identity matrix, let $F_k$ denote the k x k diagonal
matrix with diagonal entries $-1, 1, -1, 1, \ldots, (-1)^k$, let $J_k$ denote the k x k matrix with 1's
on the antidiagonal, and let $Q_k$ denote the k x k matrix with 1's on the diagonal above
the main diagonal, $-1$'s on the diagonal below the main diagonal, and zeroes elsewhere. For
example,

\[
I_5 = \begin{bmatrix} 1&0&0&0&0\\ 0&1&0&0&0\\ 0&0&1&0&0\\ 0&0&0&1&0\\ 0&0&0&0&1 \end{bmatrix}, \qquad
F_5 = \begin{bmatrix} -1&0&0&0&0\\ 0&1&0&0&0\\ 0&0&-1&0&0\\ 0&0&0&1&0\\ 0&0&0&0&-1 \end{bmatrix},
\]
\[
J_5 = \begin{bmatrix} 0&0&0&0&1\\ 0&0&0&1&0\\ 0&0&1&0&0\\ 0&1&0&0&0\\ 1&0&0&0&0 \end{bmatrix}, \qquad
Q_5 = \begin{bmatrix} 0&1&0&0&0\\ -1&0&1&0&0\\ 0&-1&0&1&0\\ 0&0&-1&0&1\\ 0&0&0&-1&0 \end{bmatrix}.
\]




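\[
\begin{bmatrix}
xr_1 & 0 & 0 & -ys_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & xr_2 & -ys_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -ys_1 & xr_3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-ys_1 & 0 & 0 & xr_4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & xr_1 & 0 & 0 & -ys_2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & xr_2 & -ys_2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -ys_2 & xr_3 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -ys_2 & 0 & 0 & xr_4 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & xr_1 & 0 & 0 & -ys_3 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & xr_2 & -ys_3 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -ys_3 & xr_3 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -ys_3 & 0 & 0 & xr_4
\end{bmatrix}
\]

FIGURE 12.18
Transformed matrix $U^{-1}XU$ for m = 4, n = 3. (Here $r_a = 2i\cos(\pi a/5)$ and $s_b = 2i\cos(\pi b/4)$.)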


The definition of X in (12.14) can now be written
\[ X = x(I_n \otimes Q_m) + y(Q_n \otimes F_m). \]


(Compare to Figure 12.17.) The following lemma can be established by routine calculations, 
which we leave to the reader. 


12.89. Lemma: Eigenvectors of $Q_k$. For $0 \le a \le k+1$ and $1 \le b \le k$, define complex
numbers
\[ U_k(a,b) = i^a \sin\left(\frac{\pi ab}{k+1}\right), \qquad \lambda_k(b) = 2i\cos\left(\frac{\pi b}{k+1}\right). \]

For $1 \le a, b \le k$, we have
\[ U_k(a+1, b) - U_k(a-1, b) = \lambda_k(b)\,U_k(a,b). \]

Therefore, the column vector $(U_k(1,b), U_k(2,b), \ldots, U_k(k,b))^T$ is an eigenvector of $Q_k$ associated
to the eigenvalue $\lambda_k(b)$. Letting $U_k = (U_k(a,b))_{1 \le a,b \le k}$ and $D_k$ be the k x k diagonal
matrix with diagonal entries $\lambda_k(b)$, we have $Q_kU_k = U_kD_k$. Furthermore, $(-1)^aU_k(a,b) =
-U_k(a, k+1-b)$ for $1 \le a, b \le k$, and therefore $F_kU_k = -U_kJ_k$.
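The two identities in the lemma can be checked numerically for small k. The sketch below is an illustration only, not the proof requested in the text; the function names `U`, `lam`, and `check_lemma` and the tolerance `tol` are assumptions introduced here.

```python
import math


def U(k, a, b):
    """U_k(a, b) = i^a * sin(pi*a*b/(k+1)), for 0 <= a <= k+1 and 1 <= b <= k."""
    return (1j ** a) * math.sin(math.pi * a * b / (k + 1))


def lam(k, b):
    """lambda_k(b) = 2i * cos(pi*b/(k+1))."""
    return 2j * math.cos(math.pi * b / (k + 1))


def check_lemma(k, tol=1e-12):
    """Check the recurrence and the reflection identity for all 1 <= a, b <= k."""
    for b in range(1, k + 1):
        for a in range(1, k + 1):
            # Eigenvector relation: U(a+1,b) - U(a-1,b) = lambda(b) * U(a,b).
            lhs = U(k, a + 1, b) - U(k, a - 1, b)
            rhs = lam(k, b) * U(k, a, b)
            if abs(lhs - rhs) > tol:
                return False
            # Reflection relation: (-1)^a U(a,b) = -U(a, k+1-b).
            if abs((-1) ** a * U(k, a, b) + U(k, a, k + 1 - b)) > tol:
                return False
    return True


print(all(check_lemma(k) for k in range(1, 9)))
```

The boundary values $U_k(0,b)$ and $U_k(k+1,b)$ vanish (up to rounding), which is why the recurrence at a = 1 and a = k agrees with the action of $Q_k$ on a vector of length k.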


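The columns of $U_k$ are linearly independent, because they are eigenvectors of $Q_k$ associated
to distinct eigenvalues. Therefore, $U_k$ is invertible, so the lemma gives $U_k^{-1}Q_kU_k = D_k$
and $U_k^{-1}F_kU_k = -J_k$. Let $U = U_n \otimes U_m$, so $U^{-1} = U_n^{-1} \otimes U_m^{-1}$. Using properties of tensor
products, we calculate

\begin{align*}
U^{-1}XU &= x(U_n^{-1} \otimes U_m^{-1})(I_n \otimes Q_m)(U_n \otimes U_m) + y(U_n^{-1} \otimes U_m^{-1})(Q_n \otimes F_m)(U_n \otimes U_m) \\
&= x(U_n^{-1}I_nU_n) \otimes (U_m^{-1}Q_mU_m) + y(U_n^{-1}Q_nU_n) \otimes (U_m^{-1}F_mU_m) \\
&= x(I_n \otimes D_m) - y(D_n \otimes J_m).
\end{align*}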


For example, if X is the matrix shown in Figure 12.17, then $U^{-1}XU$ is the matrix shown in
Figure 12.18. In general, $U^{-1}XU$ is a block-diagonal matrix consisting of n blocks, each of size m x m.
The bth block has entries $-y\lambda_n(b)$ on the anti-diagonal and entries $x\lambda_m(a)$ (for $1 \le a \le m$)
on the diagonal. Now, since m is even, we can reorder the rows and columns of each block
into this order: $1, m, 2, m-1, 3, m-2, \ldots, m/2, m/2+1$. This reordering can be accomplished
by performing an even number of row and column switches on $U^{-1}XU$, so the determinant
does not change. The new matrix is also block-diagonal, consisting of mn/2 blocks of size 2 x 2
that look like

\[
\begin{bmatrix} x\lambda_m(a) & -y\lambda_n(b) \\ -y\lambda_n(b) & x\lambda_m(m+1-a) \end{bmatrix}
\qquad (1 \le a \le m/2,\ 1 \le b \le n).
\]

Now, $\lambda_m(m+1-a) = 2i\cos(\pi(m+1-a)/(m+1)) = -2i\cos(\pi a/(m+1)) = -\lambda_m(a)$. It
follows that the determinant of the 2 x 2 block just mentioned is

\[
-x^2\lambda_m(a)^2 - y^2\lambda_n(b)^2 = 4\left[x^2\cos^2\left(\frac{\pi a}{m+1}\right) + y^2\cos^2\left(\frac{\pi b}{n+1}\right)\right].
\]

Finally, $\det(X) = \det(U^{-1}XU)$ is the product of these determinants as a ranges from 1 to
m/2 and b ranges from 1 to n. Taking the square root of det(X) and factoring out powers
of 2 produces formula (12.12).
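As a numerical sanity check on this evaluation, one can specialize to x = y = 1, multiply the 2 x 2 block determinants, and take the square root; the result should be the total number of domino tilings of the m x n board. The sketch below does this in floating point and rounds to the nearest integer. The function name `domino_tilings` is a hypothetical helper, and the rounding step is an assumption that is only safe for boards small enough that double precision resolves the product.

```python
import math


def domino_tilings(m, n):
    """Number of domino tilings of an m x n board (m even), via sqrt(det X) at x = y = 1."""
    if m % 2 != 0:
        raise ValueError("this form of the product requires m to be even")
    det_x = 1.0
    for a in range(1, m // 2 + 1):
        for b in range(1, n + 1):
            # Determinant of one 2 x 2 block: 4[cos^2(pi a/(m+1)) + cos^2(pi b/(n+1))].
            det_x *= 4.0 * (math.cos(math.pi * a / (m + 1)) ** 2
                            + math.cos(math.pi * b / (n + 1)) ** 2)
    # Pf(X) = sqrt(det X); round away the floating-point error.
    return round(math.sqrt(det_x))


print(domino_tilings(2, 2), domino_tilings(2, 3), domino_tilings(8, 8))
```

For a 2 x 2 board the two block determinants are both 2, giving det(X) = 4 and two tilings, which matches a direct count.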




Summary 


• Rational-Slope Dyck Paths. If gcd(r, s) = 1, then the number of lattice paths from (0, 0)
to (r, s) that never go below the line sx = ry is $\frac{1}{r+s}\binom{r+s}{r,s}$. For any lattice path ending
at (r, s), the r + s cyclic shifts of this path are all distinct, and exactly one of them is
an r/s-Dyck path.


• Chung-Feller Theorem. A path from (0,0) to (n,n) has k flaws iff the path has k north
steps starting below y = x. For $0 \le k \le n$, there are $C_n = \frac{1}{n+1}\binom{2n}{n}$ paths ending at
(n,n) with k flaws. Thus the number of flaws in a random path is uniformly distributed
on {0,1,2,...,n}.


• Rook-Equivalence of Ferrers Boards. For each integer partition $\mu$, $r_k(\mu)$ is the number
of ways to place k non-attacking rooks on $F_\mu = \mathrm{dg}(\mu)$, and $R_\mu(x) = \sum_{k \ge 0} r_k(\mu)x^k$.
For all partitions $\mu = (\mu_1 \ge \mu_2 \ge \cdots \ge \mu_n \ge 0)$ and $\nu = (\nu_1 \ge \nu_2 \ge \cdots \ge \nu_n \ge 0)$
with $|\mu| = n = |\nu|$, we have $R_\mu(x) = R_\nu(x)$ iff the multisets $[\mu_i + i : 1 \le i \le n]$ and
$[\nu_i + i : 1 \le i \le n]$ are equal.


• Parking Functions. A function $f : \{1,2,\ldots,n\} \to \{1,2,\ldots,n\}$ is a parking function iff
$|\{x : f(x) \le i\}| \ge i$ for all $i \le n$. There are $(n+1)^{n-1}$ parking functions of order n.
A bijection from parking functions to labeled Dyck paths is given by putting the labels
$\{x : f(x) = i\}$ in increasing order in column i for each i. A bijection from labeled Dyck
paths to trees is given by letting the children of $a_i$ be the labels in column i + 1, for all
$i \ge 0$ (where $a_0 = 0$ and $a_1, \ldots, a_n$ are the labels from bottom to top).


• Facts about Cyclic Groups. If G is a cyclic group of size $n < \infty$, then G has a unique
cyclic subgroup of size d for each divisor d of n, and these are all the subgroups of G.
Any cyclic group of size d has $\phi(d)$ generators, and hence $n = \sum_{d|n} \phi(d)$. If G is a group
of size n with at most one subgroup of size d for each divisor d of n, then G must be
cyclic. Hence, any finite subgroup of the multiplicative group of a field is cyclic.


• Counting Irreducible Polynomials. The size of a finite field must be a prime power.
For each prime power q, there exists a field F with q elements, which is unique up
to isomorphism. For such a field F, let I(n,q) be the number of monic irreducible
polynomials of degree n in F[x]. Classifying elements in the field of size $q^n$ by their
minimal polynomials in F[x] gives $q^n = \sum_{d|n} d\,I(d,q)$. Hence, by Möbius inversion,
$I(n,q) = \frac{1}{n}\sum_{d|n} q^d\mu(n/d)$, where $\mu$ is the Möbius function defined in 4.28.


• Subspaces of Vector Spaces over Finite Fields. A d-dimensional vector space over a
q-element field has size $q^d$. The number of k-dimensional subspaces of an n-dimensional
vector space over a q-element field is $\begin{bmatrix} n \\ k \end{bmatrix}_q$. Each such subspace has a unique basis in
reduced row-echelon form (RREF). The number of k x n RREF matrices with entries
in a q-element field is thus $\begin{bmatrix} n \\ k \end{bmatrix}_q$.


• Combinatorial Meaning of Tangent and Secant Power Series. $\tan x = \sum_{n \ge 0} (a_n/n!)x^n$,
where $a_n$ counts permutations w satisfying $w_1 < w_2 > w_3 < w_4 > \cdots > w_n$; and $\sec x =
\sum_{n \ge 0} (b_n/n!)x^n$, where $b_n$ counts permutations w satisfying $w_1 < w_2 > w_3 < \cdots < w_n$.


• Tournaments. A tournament is a digraph with exactly one directed edge between each
pair of distinct vertices. A tournament t is transitive iff $(u,v) \in t$ and $(v,w) \in t$ always
imply $(u,w) \in t$ iff t contains no directed 3-cycle iff the outdegrees of the vertices of t
are pairwise distinct. A sign-reversing involution exists that cancels all non-transitive
tournaments, leading to this formula for the Vandermonde determinant:

\[
\det\|x_j^{n-i}\|_{1 \le i,j \le n} = \sum_{w \in S_n} \mathrm{sgn}(w) \prod_{k=1}^{n} x_{w(k)}^{n-k} = \prod_{1 \le i < j \le n} (x_i - x_j).
\]


• Hook-Length Formula. For a partition $\lambda$ with n boxes, the number of standard tableaux
of shape $\lambda$ is $n!/\prod_{c \in \mathrm{dg}(\lambda)} h(c)$, where h(c) is the hook-length of cell c. This can be proved
probabilistically by defining a random algorithm that generates each $S \in \mathrm{SYT}(\lambda)$ with
probability $\prod_{c \in \mathrm{dg}(\lambda)} h(c)/n!$. To build S, start at a random cell in $\mathrm{dg}(\lambda)$, then repeatedly
jump to a random cell in the hook of the current cell until reaching a corner. Place n in
this corner and proceed recursively to fill the other cells in $\mathrm{dg}(\lambda)$.


• Knuth Equivalence and Monotone Subsequences of Words. Two words v and w are Knuth
equivalent iff v can be changed into w by a sequence of moves of the form $\cdots yxz \cdots \leftrightarrow
\cdots yzx \cdots$ (where $x < y \le z$) or $\cdots xzy \cdots \leftrightarrow \cdots zxy \cdots$ (where $x \le y < z$). These
moves simulate tableau insertion (when applied to reading words), so every w is Knuth
equivalent to the reading word of its insertion tableau P(w). Words v and w are Knuth
equivalent iff P(v) = P(w). If P(w) has shape $\lambda$, then $\lambda_1 + \cdots + \lambda_k$ is the maximum
total length of a set of k disjoint weakly increasing subsequences of w, and $\lambda'_1 + \cdots + \lambda'_k$
is the maximum total length of a set of k disjoint strictly decreasing subsequences of w.


• Pfaffians. Let N be even. Given an N x N matrix A that is skew-symmetric ($A^T = -A$),
the Pfaffian of A is

\[
\mathrm{Pf}(A) = \sum_{w \in \mathrm{SPf}_N} \mathrm{sgn}(w) \prod_{i \text{ odd}} A(w_i, w_{i+1}),
\]

where $w \in \mathrm{SPf}_N$ iff $w \in S_N$, $w_i < w_{i+1}$, and $w_i < w_{i+2}$ for all odd i. We have
$\det(A) = \mathrm{Pf}(A)^2$. Each term of Pf(A) counts a signed, weighted perfect matching of a
graph with vertex set {1,2,...,N}, where an edge from i to j (for i < j) is weighted
by A(i,j). There is a recursion $\mathrm{Pf}(A) = \sum_{j=2}^{N} (-1)^j A(1,j)\,\mathrm{Pf}(A[[1,j]])$, where A[[1,j]]
is the matrix obtained by deleting rows 1 and j and columns 1 and j from A.

• Domino Tilings. For all $m, n \in \mathbb{N}^+$ with m even, the coefficient of $x^ay^b$ in

\[
2^{mn/2} \prod_{j=1}^{m/2} \prod_{k=1}^{n} \sqrt{x^2\cos^2\left(\frac{j\pi}{m+1}\right) + y^2\cos^2\left(\frac{k\pi}{n+1}\right)}
\]

is the number of ways to tile an m x n board with a horizontal dominos and b vertical
dominos. The steps in the proof are: (a) model domino tilings by perfect matchings of
a grid-shaped graph; (b) use a Pfaffian to enumerate these signed perfect matchings;
(c) adjust signs in the matrix so every perfect matching has sign +1; (d) rewrite the
Pfaffian as the square root of the determinant of the matrix; (e) evaluate the determinant
by performing a similarity transformation that nearly diagonalizes the matrix, creating
2 x 2 blocks running down the diagonal. Each 2 x 2 block contributes one of the factors
in the product formula above.
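Steps (b) and (d) rest on the Pfaffian recursion and on the identity det(A) = Pf(A)^2 recalled in this summary. A minimal sketch of both, assuming exact rational arithmetic and small matrices, is given below; the helper names `pfaffian`, `det`, and `skew_from_upper` and the sample entries are hypothetical and serve only to illustrate the recursion along row 1.

```python
from fractions import Fraction


def pfaffian(A):
    """Pfaffian of a skew-symmetric matrix of even size, by expansion along row 1."""
    N = len(A)
    if N == 0:
        return 1
    if N % 2 != 0:
        raise ValueError("the Pfaffian is defined here only for even N")
    total = 0
    for j in range(1, N):  # 0-based column j is the 1-based column j + 1
        if A[0][j] == 0:
            continue
        keep = [r for r in range(N) if r not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]  # delete rows/cols 1 and j+1
        sign = 1 if j % 2 == 1 else -1  # (-1)^(j+1) in 1-based indexing
        total += sign * A[0][j] * pfaffian(minor)
    return total


def det(A):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    n, d = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d  # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return d


def skew_from_upper(upper, N):
    """Build an N x N skew-symmetric matrix from its strictly upper entries, row by row."""
    A = [[0] * N for _ in range(N)]
    entries = iter(upper)
    for i in range(N):
        for j in range(i + 1, N):
            A[i][j] = next(entries)
            A[j][i] = -A[i][j]
    return A


A4 = skew_from_upper([1, 2, 3, 4, 5, 6], 4)   # Pf = 1*6 - 2*5 + 3*4
A6 = skew_from_upper(list(range(1, 16)), 6)   # arbitrary 6 x 6 example
print(pfaffian(A4), det(A4) == pfaffian(A4) ** 2, det(A6) == pfaffian(A6) ** 2)
```

The recursion has factorial cost, so this is suitable only for very small N; the point of steps (d) and (e) in the proof is precisely to avoid expanding the Pfaffian term by term.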




Exercises 


12.90. Let ~ be the cyclic shift relation from §12.1. Find all the equivalence classes of ~ 
for: (a) the set of lattice paths ending at (3,4); (b) the set of lattice paths ending at (3,3). 


12.91. For $v, w \in R(N^sE^r)$, write v ~ w iff w can be obtained from v by a cyclic shift.
Which of the following statements is always true for all r,s > 1? (a) Every equivalence class 
of ~ has size r + s. (b) Every equivalence class of ~ contains at least one r/s-Dyck path. 
(c) Every equivalence class of ~ contains at most one r/s-Dyck path. 


12.92. Let $k \ge 0$ and $m \ge 1$ be integers. Show that the number of lattice paths from (0,0)
to (k + mh, h) that never go below the line x = k + my is

\[
\binom{k+(m+1)h}{k+mh,\ h} - m\binom{k+(m+1)h}{k+mh+1,\ h-1}.
\]


Give a bijective proof analogous to the proof of 1.56 in §1.10. 


12.93. Verify the Chung-Feller theorem directly for n = 3 by drawing all lattice paths from 
(0,0) to (3,3) with: (a) 0 flaws; (b) 1 flaw; (c) 2 flaws; (d) 3 flaws. 


12.94. Let $\pi$ be the Dyck path NNENEENNNENNENNEEENENEEE. Use the bijections
from 12.4 to compute the associated lattice path with: (a) 5 flaws; (b) 8 flaws; (c) 10 flaws. 


12.95. For each flawed path $\pi$, find the Dyck path associated to $\pi$ via the bijections in 12.4:
(a) NENNEEEENENNNEENEENENNNE; (b) NEEENNENEEENNNNNEENE. 


12.96. Let $\pi$ be a random lattice path from (0,0) to (n,n), and for $1 \le j \le n$, let

\[ X_j(\pi) = \chi(\pi \text{ has a flaw in row } j). \]

Prove bijectively that $P(X_j = 0) = 1/2 = P(X_j = 1)$.


12.97. Let $X_1, X_2, \ldots, X_n$ be independent random variables such that $P(X_i = 1) = 1/2 =
P(X_i = 0)$ for all i. (This means that, for all $v_1, \ldots, v_n \in \{0,1\}$, the events $X_1 = v_1$, $X_2 =
v_2$, ..., $X_n = v_n$ are independent in the sense of 1.84.) Compute $P(X_1 + X_2 + \cdots + X_n = k)$
for $0 \le k \le n$. Contrast your answer with the Chung-Feller theorem.




12.98. Find a formula for the number of lattice paths from (0,0) to (n,n) with k flaws and 
j east steps departing from the line y = x.


12.99. Compute the rook polynomial for each of the following partitions: 
(a) (3, 2, 1); (b) (8, 8, 8, 8, 8, 8, 8, 8); (c) (); (d) (n,n, 1*).


12.100. Draw the diagrams of all integer partitions of 8 and determine which pairs of 
partitions are rook-equivalent. 


12.101. Prove: for any integer partition $\mu$, $R_\mu(x) = R_{\mu'}(x)$.


12.102. (a) For any $n \ge 1$, prove that the partition $\mu$ consisting of n copies of n is rook-equivalent
to the partition $\nu = (2n-1, 2n-3, \ldots, 5, 3, 1)$. (b) Define a bijection between
the set of non-attacking placements of k rooks on $\mu$ and the set of non-attacking placements
of k rooks on $\nu$.


12.103. Let $\mu$ be an integer partition such that $\mathrm{dg}(\mu) \subseteq \mathrm{dg}(\Delta_N)$, where $\Delta_N = (N-1, N-2,
\ldots, 3, 2, 1, 0)$. Suppose the sequence $(N-1-\mu_1, N-2-\mu_2, \ldots, 0-\mu_N)$ has $a_k$ copies
of k for $k \ge 0$. (Note that this sequence gives the row lengths of the skew shape $\Delta_N/\mu$.)
Prove that the number of partitions that are rook-equivalent to $\mu$ is

\[
\prod_{k \ge 1} \binom{a_{k-1} + a_k - 1}{a_{k-1} - 1,\ a_k}.
\]


12.104. Show that for each integer partition $\mu$, there is a unique integer partition $\nu$ with
distinct parts that is rook-equivalent to $\mu$.


12.105. Suppose $\mu$ is an integer partition with $\mathrm{dg}(\mu) \subseteq \mathrm{dg}(\Delta_N)$, where $\Delta_N = (N-1, N-2,
\ldots, 2, 1, 0)$. (a) Using a suitable involution, prove that


k 
4=0 


where S(u,v) is a Stirling number of the second kind and $e_i$ is an elementary symmetric
polynomial. (b) Deduce from (a) a combinatorial proof of part (d) of 2.77. (c) Deduce from
(a) that the multiset condition in 12.10 is sufficient for $R_\mu(x) = R_\nu(x)$. (d) Assume $\mu$ and
$\nu$ are rook-equivalent partitions. Use (a) and the Garsia-Milne involution principle 4.126 to
construct a bijection from the set of non-attacking placements of k rooks on $F_\mu$ to the set
of non-attacking placements of k rooks on $F_\nu$.


12.106. For each labeled Dyck path in Figure 12.9, compute the associated parking function 
and tree (see 12.21 and 12.22). 


12.107. (a) Convert the parking function f in Figure 12.5 to a labeled Dyck path and a tree. 
(b) Convert the labeled Dyck path NNENNEENEENENNEE with labels 5, 8,2,4,1,6,3,7 
(from bottom to top) to a parking function and a tree. (c) Convert the tree 


T = ({0, 1, ..., 10}, {{0, 9}, {5, 7}, {5, 8}, {9, 4}, {7, 6}, {6, 9}, {7, 10}, {10, 1}, {3, 9}, {2, 9}})


to a labeled Dyck path and a parking function. 


12.108. Suppose we represent a function $f : \{1,2,\ldots,b\} \to \{1,2,\ldots,a+1\}$ as a labeled
lattice path ending at (a,b). Find conditions on the labeled path that are equivalent to f
being (a) surjective; (b) injective.
12.109. (a) Given nonnegative integers $c_1, \ldots, c_{a+1}$ adding to b, how many labeled lattice
paths from (0,0) to (a,b) have $c_i$ labels in column i for all i? (b) Use the bijections in §12.5
to translate (a) into enumeration results for parking functions and trees. 


12.110. (a) Let $p_n$ be the number of parking functions of order n. Give a combinatorial
proof of the recursion

\[
p_n = \sum_{m=1}^{n} m\binom{n-1}{m-1}\,p_{m-1}\,p_{n-m}.
\]


(b) Use (a) and 3.186 to define a bijection between parking functions and trees. 


12.111. For a parking function $f \in P_n$, let $\mathrm{wt}(f) = n(n+1)/2 - \sum_{i=1}^{n} f(i)$. Let $P_n(x) =
\sum_{f \in P_n} x^{\mathrm{wt}(f)}$. Prove the recursion


12.112. Let S be a k-element subset of {1,2,...,n}. Prove that there are $kn^{n-k-1}$ parking
functions f such that S = {x : f(x) = 1}.


12.113. For each $n, k, m \in \mathbb{N}$, let $P_{n,k,m}$ be the set of labeled lattice paths ending at
(k + mn, n) that never go below the line x = k + my. Find a recursion satisfied by the
quantities $|P_{n,k,m}|$.


12.114. Find a bijection between the set of parking functions of order n and the quotient
group $\mathbb{Z}_{n+1}^n/H$, where H is the subgroup generated by (1,1,...,1).


12.115. How many generators does an infinite cyclic group have? 


12.116. Prove or disprove: if every proper subgroup of a finite group G is cyclic, then G 
itself must be cyclic. 


12.117. Suppose G is a group such that, for all $d \ge 1$, G has at most d elements x such
that $x^d = 1$. Prove that every finite subgroup of G is cyclic.


12.118. Describe all the finite subgroups of the field C. 


12.119. Quaternions. Let H be a four-dimensional real vector space with basis 1, i, j, k.
Define multiplication on H by letting 1 act as the identity, setting $i^2 = j^2 = k^2 = -1$,
$ij = k = -ji$, $jk = i = -kj$, $ki = j = -ik$, and extending by linearity. (a) Show that H
with this multiplication is a division ring (i.e., H satisfies all the axioms in the definition
of a field except commutativity of multiplication). (b) Find a non-cyclic finite subgroup of
H* (cf. 12.29). (c) Show that the equation $x^2 = -1$ has infinitely many solutions in H*.


12.120. Prove that the product of all the nonzero elements in a finite field F is $-1_F$.
Deduce Wilson's theorem: for p prime, $(p-1)! \equiv -1 \pmod{p}$.


12.121. Compute the number of monic irreducible polynomials of degree 12 over a 9- 
element field. 


12.122. (a) Enumerate all the irreducible polynomials in $\mathbb{Z}_2[x]$ of degree at most 5. (b) Use
the formula in 12.30 to compute I(n,2) for $1 \le n \le 8$ (compare with the results in (a) for
$n \le 5$).
12.123. Construction of Finite Fields. Let F be a field with q elements, let $h \in F[x]$
be a fixed monic irreducible polynomial of degree n, and let

\[ K = \{f \in F[x] : f = 0 \text{ or } \deg(f) < n\}. \]

For $f, g \in K$, define f + g to be the usual sum of polynomials in F[x], and define f x g to
be the remainder when fg is divided by h. Show that K, with these operations, is a field
of size $q^n$. The field K is denoted F[x]/(h).


12.124. Let $K = \mathbb{Z}_2[x]/(x^3 + x + 1)$ (see 12.123). Construct addition and multiplication
tables for K. Explicitly confirm that K* is generated by x by computing $x^i$ for $1 \le i \le 7$.


12.125. Let $h = x^4 + x + 1 \in \mathbb{Z}_2[x]$, and let $K = \mathbb{Z}_2[x]/(h)$, which is a 16-element field
(see 12.123). (a) Explain why every element $y \in K$ satisfies $y^{16} = y$. (b) List all the elements
of K and their minimal polynomials over $\mathbb{Z}_2$. (c) Factor the polynomial $x^{16} - x \in \mathbb{Z}_2[x]$ into
a product of irreducible polynomials. (d) Explain the relation between part (b), part (c),
and the formulas in 12.30. (e) Find all generators of the cyclic group K*.


12.126. (a) Use 12.30 to show that I(n,q) > 0 for all prime powers q and all $n \ge 1$. (b)
Prove that for every prime power $p^n$, there exists a field of size $p^n$.


12.127. Let F be a finite field of size q. A polynomial $h \in F[x]$ is called primitive iff h is a
monic irreducible polynomial such that x is a generator of the multiplicative group of the
field K = F[x]/(h) (see 12.123). (a) Count the primitive polynomials of degree n in F[x].
(b) Give an example of an irreducible polynomial in $\mathbb{Z}_2[x]$ that is not primitive.


12.128. Let K be a q-element field. How many n x n matrices with entries in K are: (a) 
upper-triangular; (b) strictly upper-triangular (zeroes on the main diagonal); (c) unitrian- 
gular (ones on the main diagonal); (d) upper-triangular and invertible? 


12.129. How many 2 x 2 matrices with entries in a q-element field have determinant 1?


12.130. Count the number of invertible n x n matrices with entries in a q-element field F.
How is the answer related to $[n]!_q$?


12.131. How many 3-dimensional subspaces does the vector space Z? have? 


12.132. For each integer partition $\mu$ that fits in a box with 2 rows and 3 columns, draw a
picture of the RREF matrix associated to $\mu$ in the proof of 12.37.


12.133. Find the RREF basis for the subspace of Z2 spanned by v1 = (1,4,2,3,4), v2 =
(2,3,1,0,0), and v3 = (0,0,3, 1,1). 


12.134. Find the RREF basis for the subspace of Z$ spanned by v1 = (0,1,1,1,1,0), 
v2 = (1,1,1,0,1,1), and v3 = (1,0,1,0,0,0).


12.135. Let V be an n-dimensional vector space over a field K. A flag of subspaces of
V is a chain of subspaces $V = V_0 \supseteq V_1 \supseteq V_2 \supseteq \cdots \supseteq V_s = \{0\}$. Suppose |K| = q.
Given $n_1, \ldots, n_s$ and $n = n_1 + \cdots + n_s$, count the number of such flags in V such that
$\dim_K(V_{i-1}) - \dim_K(V_i) = n_i$ for $1 \le i \le s$.


12.136. (a) Give a linear-algebraic proof of the symmetry property $\begin{bmatrix} n \\ k \end{bmatrix}_q = \begin{bmatrix} n \\ n-k \end{bmatrix}_q$ when q
is a prime power. (b) Explain how the equality of formal polynomials $\begin{bmatrix} n \\ k \end{bmatrix}_x = \begin{bmatrix} n \\ n-k \end{bmatrix}_x$ can be
deduced from (a).
12.137. Let V be an n-dimensional vector space over a q-element field, and let X be the
poset of all subspaces of V, ordered by inclusion. Show that the Möbius function of X is
given by $\mu_X(W,Y) = (-1)^d q^{d(d-1)/2}\chi(W \subseteq Y)$, where d = dim(Y) - dim(W). (Use 6.61.)


12.138. Use the recursions for $a_n$ and $b_n$ in §12.8 to verify the values in (12.2).


12.139. Give probabilistic interpretations for the rational numbers appearing as coefficients
in the Maclaurin series for tan x and sec x.


12.140. Fill in the details of Step 5 of the proof in §12.8.


12.141. (a) List the permutations satisfying (12.3) for n = 1,3, 5. (b) List the permutations 
satisfying (12.4) for n = 0, 2,4. 


12.142. (a) Develop ranking and unranking algorithms for up-down permutations. (b)
Unrank 147 to get an up-down permutation in $S_7$. (c) Find the rank of 2,5,3,6,4,8,1,7
among up-down permutations of length 8.


12.143. Let $(q;q)_0 = 1$ and $(q;q)_n = (1-q)(1-q^2)\cdots(1-q^n)$ for $n \ge 1$. Consider the
following q-analogues of formal trigonometric functions:

\[
\sin_q = \sum_{k \ge 0} (-1)^k \frac{x^{2k+1}}{(q;q)_{2k+1}}, \qquad
\cos_q = \sum_{k \ge 0} (-1)^k \frac{x^{2k}}{(q;q)_{2k}} \in \mathbb{Q}(q)[[x]];
\]
\[
\tan_q = \sin_q/\cos_q; \qquad \sec_q = 1/\cos_q.
\]

Define q-tangent numbers and q-secant numbers by $t_n = (q;q)_n \tan_q(n) \in \mathbb{Q}(q)$ and $s_n =
(q;q)_n \sec_q(n) \in \mathbb{Q}(q)$. (a) Show that for each $n \ge 0$,

\[
t_n = \sum_{w \text{ satisfying } (12.3)} q^{\mathrm{inv}(w)}, \qquad s_n = \sum_{w \text{ satisfying } (12.4)} q^{\mathrm{inv}(w)}.
\]

(b) Use (a) to conclude that $t_n, s_n \in \mathbb{N}[q]$. Compute $t_n$ for n = 1,3,5 and $s_n$ for n = 0,2,4.


12.144. Let t be the tournament with edge set

{(2, 1), (1, 3), (4, 1), (1, 5), (6, 1), (2, 3), (4, 2), (5, 2), (2, 6),
(3, 4), (3, 5), (6, 3), (4, 5), (6, 4), (6, 5)}.

Compute wt(t), inv(t), and sgn(t). Is t transitive?
12.145. Let t be the tournament in 12.41 and I the involution used to prove 12.46. Compute
t' = I(t), and verify directly that wt(t') = wt(t), sgn(t') = -sgn(t), and I(t') = t.
12.146. Use induction and 9.47 to give an algebraic proof of 12.46. 


12.147. Suppose $x_0, x_1, \ldots, x_N$ are distinct elements of a field F. State why the Vandermonde
matrix $[x_i^j]_{0 \le i,j \le N}$ is invertible. Use this to prove the fact (asserted in 2.79) that
if $p \in F[x]$ has degree at most N and satisfies $p(x_i) = 0$ for $0 \le i \le N$, then p must be the
zero polynomial.


12.148. A king in a tournament t is a vertex v from which every other vertex can be reached 
by following at most 2 directed edges. Show that every vertex of maximum outdegree in a 
tournament is a king; in particular, every tournament has a king. 


12.149. Use the hook-length formula to compute $f^\lambda$ for the following shapes $\lambda$: (a) (3, 2, 1);
(b) (4, 4, 4); (c) (6, 3, 2, 2, 1, 1, 1); (d) (n, n-1); (e) (a, i)
12.150. Show that $f^{(0)} = 1$ and, for all nonzero partitions $\lambda$, $f^\lambda = \sum_\mu f^\mu$, where we sum
over all $\mu$ that can be obtained from $\lambda$ by removing some corner square. Use this recursion
to calculate $f^\lambda$ for all $\lambda$ with $|\lambda| \le 6$.


12.151. (a) Develop ranking and unranking algorithms for standard tableaux of shape $\lambda$
based on the recursion in 12.150. (b) Unrank 46 to get a standard tableau of shape (4,3,1).


(c) Rank the standard tableau [2/5]8 |. 
[6] 7] 


12.152. Enumerate all the hook walks for the shape $\lambda = (4,3,2,1)$ that end in the corner
cell (2,3), and compute the probability of each walk. Use this computation to verify 12.54 
in this case. 


12.153. Suppose $\lambda \in \mathrm{Par}(p)$ where p is prime. (a) Show that p divides $f^\lambda$ if $\lambda$ is not a hook
(see 10.3). (b) Compute $f^\lambda$ mod p if $\lambda$ is a hook.


12.154. Does the hook-length formula extend to enumerate standard tableaux of skew 
shape? Either adapt the probabilistic proof to this situation, or find the steps in the proof 
that cannot be generalized. 


12.155. Confirm that $\equiv_P$ and $\equiv_K$ are equivalence relations on X*, as asserted in §12.11.


12.156. Let T be the tableau in 12.62. Find an explicit chain of elementary Knuth equivalences
demonstrating that $\mathrm{rw}(T)1 \equiv_K \mathrm{rw}(T \leftarrow 1)$.


12.157. Find the length of the longest increasing and decreasing subsequences of the word 
w = 4135321462731132423142. 


12.158. Complete the proofs of 12.64 and 12.65. 


12.159. For any semistandard tableau T, prove that P(rw(T)) = T. Show that the set of 
reading words of semistandard tableaux intersects every Knuth equivalence class in exactly 
one point. 


12.160. Prove 12.70 without using the RSK algorithm. 
12.161. Show that if A is an N x N skew-symmetric matrix with N odd, then det(A) = 0. 


12.162. Verify by direct calculation that det(A) = (af + cd — be)? for the 4 x 4 matrix A 
in 12.72. 


12.163. Find the Pfaffian of a general 6 x 6 skew-symmetric matrix. 
12.164. Count the number of perfect matchings for the graph shown in Figure 12.14. 
12.165. Let G be the simple graph with V(G) = {1,2,3,4,5,6} and

E(G) = {{2, 3}, {3, 4}, {4, 5}, {2, 5}, {1, 2}, {1, 5}, {3, 6}, {4, 6}, {2, 4}}.

Find all perfect matchings of G. Use this to compute $\sum_{M \in \mathrm{PM}(G)} \mathrm{sgn}(M)\,\mathrm{wt}(M)$, and verify
your answer by evaluating a suitable Pfaffian.


12.166. Compute the images of the following permutations $w \in S_8^{\mathrm{ev}}$ under the bijection
$S_N^{\mathrm{ev}} \to \mathrm{SPf}_N^2$ used in the proof of 12.83: (a) w = (3,1,5,7)(2,4,8,6); (b) w =
(1,4)(2,3)(5,7)(6,8); (c) w = (2,5,1,6,8,4,7,3); (d) w = (3,2,1,5,6,7)(4,8).
12.167. Compute the images w of the following pairs $(u,v) \in \mathrm{SPf}_8^2$ under the bijection
$\mathrm{SPf}_N^2 \to S_N^{\mathrm{ev}}$ used in the proof of 12.83: (a) u = 13254768, v = 15283647; (b) u = 13254768,
v = 12374856; (c) u = 15243867 = v. In each case, confirm that the term indexed by w in
det(A) equals the term indexed by (u,v) in $\mathrm{Pf}(A)^2$.


12.168. Compute the exact number of domino tilings of a 10 x 10 board and a 6 x 9 board. 


12.169. How many domino tilings of an 8 x 8 board use: (a) 24 horizontal dominos and 8 
vertical dominos; (b) 4 horizontal dominos and 28 vertical dominos? 


12.170. Complete 12.87 by writing out w in full and showing that the placement of every
new domino never causes sgn*(M) to become negative.


12.171. Verify the four properties of tensor products of matrices stated just below 12.88. 
12.172. Prove 12.89. 


12.173. Let $U_k$ be the matrix defined in 12.89. Show that $\sqrt{2/(k+1)}\,U_k$ is a unitary matrix
(i.e., $U^{-1} = U^*$, where $U^*$ is the conjugate-transpose of U).


12.174. (a) Prove that, for even m, 


m/2 Zym+1 _ fa, yo|mrrl 
Ul 1? ReGen a eee a cae lee 


(b) Deduce that

\[
\prod_{j=1}^{m/2} 2\cos\left(\frac{j\pi}{m+1}\right) = 1.
\]
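12.175. Show that formula (12.12) simplifies to

\[
2^{mn/2} \prod_{j=1}^{m/2} \prod_{k=1}^{n/2} \left[x^2\cos^2\left(\frac{j\pi}{m+1}\right) + y^2\cos^2\left(\frac{k\pi}{n+1}\right)\right] \qquad (n \text{ even});
\]
\[
2^{m(n-1)/2}\,x^{m/2} \prod_{j=1}^{m/2} \prod_{k=1}^{(n-1)/2} \left[x^2\cos^2\left(\frac{j\pi}{m+1}\right) + y^2\cos^2\left(\frac{k\pi}{n+1}\right)\right] \qquad (n \text{ odd}).
\]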


Notes 


§12.1. Detailed treatments of the theory of lattice paths may be found in Mohanty and 
Narayana [94, 98]. §12.2. The Chung-Feller theorem was originally proved in Chung and 
Feller [25]; the bijective proof given here is due to Eu, Fu, and Yeh [35]. §12.3. There is a 
growing literature on rook theory; some of the early papers in this subject are [41, 55, 56, 74]. 
§12.4. More information about parking functions may be found in [39, 43, 81, 123]. §12.6. 
Expositions of field theory may be found in Hungerford [70] or Chapter 5 of Bourbaki [19]. 
An encyclopedic reference for the subject of finite fields is Lidl and Niederreiter [83]. §12.7. 
For more on Gaussian elimination and RREF matrices, see linear algebra texts such as 
Hoffman and Kunze [69]. §12.8. The combinatorial interpretation of the coefficients of the 
tangent and secant power series is due to André [2, 4]. For more information on q-analogues
of the tangent and secant series, see [6, 7, 37]. §12.9. Moon [97] gives a thorough account
of tournaments. The combinatorial derivation of the Vandermonde determinant is due to
Gessel [52]. §12.10. The probabilistic proof of the hook-length formula is due to Greene, 
Nijenhuis, and Wilf [62]. §12.11. A discussion of Knuth equivalence and its connection to 
the RSK correspondence appears in Knuth [77]. The theorem 12.65 on disjoint monotone 
subsequences was proved by Greene [61]; this generalizes Schensted’s original result [122] 
on the size of the longest increasing subsequence of a word. §12.13. Our treatment of the 
domino tiling formula closely follows the presentation in Kasteleyn’s original paper [75]. 


Answers and Hints to Selected Exercises 


1.88. (a) 200 + 142 — 28 = 314; (b) 314 — 28 = 286. 1.91. 2” — 2. 1.92. (a) n(n — 1). 
1.94. about 1.8 billion years. 1.95. (a) ("?) = 120; (b) 176. 1.96. (,°,)(;/5) = 210. 1.97. 
(ce) ($)5?21"-?. 1.99. n*/? (for k even); n*+)/? (for k odd). 1.100. (c) uv, uw, ux, uy, 
vu, VW, VX, VY, WU, WV, WX, Wy, XU, XV, XW, xy, yu, yv, yw, yx. 1.102. (c) [aaa], [aabl, 
[aac], [abb], [abc], [acc], [bbb], [bbe], [bec], [ccc]. 1.103. (a) (4), (3,1), (2,2), (2,1, 1), (1,3), 
(1, 2,1), (1,1, 2), (1,1, 1,1). 1.104. e.g., (2, 2,1) has picture[_] |] [_[_] [_]Jand word 0101. 
1.107. e.g., NNEEEN maps to NNEEEE; NEEENN maps to NEEENE; ENENNE maps to 
ENEEEN; etc. 1.109. (a) ('’) = 286; (b) 41°; (d) For (b), first compute the answer if there 
are only two children (cf. 1.91) or three children. This will be easier after Chapter 4. 1.110. 
Each positive divisor of n has a unique prime factorization of the form pt ee pit where 
0< f; <e; for each 7. The product rule gives (e; + 1)(e2 + 1)--- (ex +1) positive divisors 
of n. There are twice as many divisors in Z. 1.111. (c) Go) —1. 1.112. (a) e.g., ®(12) = 
{1,5,7, 11} and 4(12) = 4; (b) d(p) = p—1, since 1, 2,..., p—1 must be relatively prime to p. 
1.113. (a) Use ideas from the second proof of 1.58. 1.114. (c) Regard ‘PP’ as a single letter; 
then 1.46 gives lia) = 6300; (d) 1.113(a) can be useful. 1.116. (a) (eae 1.118. (a) 
6/36 = 1/6; (b) 5/36. 1.119. 0.54. 1.120. (b) 215/265 = 0.344. 1.121. (a) (‘P)/(8) = 
7/102 ≈ 0.069. 1.122. (b) There are $2^{10}$ outcomes in the sample space. We can roll zero
heads in one way, or exactly one head in ten ways. So the probability of at least two heads
is $(2^{10} - 1 - 10)/2^{10} = 1013/1024 \approx 0.9893$. 1.123. (, 9 9'.9.3) /6'° © 0.000042. 1.125. 0.16.
1.126. (a) 4/5; (b) 8/15; (c) 1/2. 1.127. (d) No, since $P(A \cap B \cap D) = 0 \ne P(A)P(B)P(D)$.
1.128. 100! has $\lfloor 100/5 \rfloor + \lfloor 100/25 \rfloor = 24$ trailing zeroes. 1.130. (a) 468,559; (b) 600,000.
1.131. (a) 2”*. 1.132. P(full house) is (12- (5) + (3) -12- (3) +12-(3)-11-(5))/(2) = 6/4165 = 
0.00144. 1.133. P(straight) is 10-8°/('2*) ~ 0.00356 (this includes straight flushes); P(five- 
of-a-kind) is 13 - (3)/(*2*) » 7.92 x 1076. 1.135. (a) 3('2)/(®) = 33/4921 ~ 0.0067. 
1.136. P(straight flush|K) = 27) ~ 8 x 10~°, so the event of getting a straight flush 
is not independent of K. P(four-of-a-kind|K) = (48 + 12)/(°!) = 1/4165, so the event 
of getting four-of-a-kind is independent of K. 1.137. (a) 52!/45! = 674, 274, 182, 400; (b) 
13. @) -7!/|S| = 1/595 = 0.00168. 1.139. Since A and B are nonempty, there exist a € A and 
b € B. Since A # B, there exists c with either c€ A andc g B, orc € Band c¢ A. In the 
first case, (c, b) isin Ax B but not Bx A. In the second case, (a,c) isin Ax B but not Bx A. 
Thus, Ax B 4 Bx A. 1.141. (b) As n tends to infinity with & held fixed, the probability that 
a random k-letter word using an n-letter alphabet has no repeated letters tends to 1. 1.143. 
(b) Since ANB = 6, (ANC)N(BNC) =9. Thus, P((AUB)NC) = P((ANC)U(BNC)) = 
P(ANC)+P(BNC) = P(A)P(C) + P(B)P(C) = (P(A) + P(B))P(C) = P(AUB)P(C). 
So AUB and C are independent. 1.144. (c) Assume f is injective and g,h: W — X satisfy 
fog=foh. Fix w € W. We know f(g(w)) = f(h(w)), hence g(w) = h(w). So g =h. For 
the converse, fix 71,%2 € X with f(x) = f(x2). Let W = {0} and define gh: W — X 
by g(0) = 21, R(O) = xg. Then fog = f oh since both functions map 0 to f(x). So 
g =h by hypothesis, which means 7; = 22. So f is injective. 1.146. (a) Prove the second 
inequality with a counting argument. (d) Such an algorithm requires at least cn log, n time 
steps, for some constant c. 1.147. Each weighing has three possible outcomes (left side 
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heavier, right side heavier, or two sides equal). Compare to 1.146. 1.149. Hint: compute 
$f(a+1, b-1) - f(a, b)$ for b > 0. 1.150. (d) Use the fact that for $a, b \in \mathbb{N}^+$, gcd(a, b) = 1 iff
there exist integers r, s with ar + bs = 1. (e) By (d), $\phi(n) = \prod_i \phi(p_i^{e_i})$. Now use 1.112(c).
1.151. Consider the quotient and remainder when an integer z is divided by n. 1.152. (b)
Define $g(((a, b), c)) = (a, (b, c))$ for $a \in X$, $b \in Y$, $c \in Z$. Note g is a bijection, since $g'$ given
by $g'((a, (b, c))) = ((a, b), c)$ is a two-sided inverse. (c) Say |X| = x, |Y| = y, and |Z| = z.
The existence of g in (b) shows $|(X \times Y) \times Z| = |X \times (Y \times Z)|$, so repeated use of the
product rule gives (xy)z = x(yz). 1.154. Show that the set $\{x \in X : x \notin f(x)\} \in \mathcal{P}(X)$
cannot be of the form f(z) for any $z \in X$. 1.155. Show that each function $f : \mathbb{N} \to {}^{\mathbb{N}}\{0,1\}$
cannot be surjective (cf. 1.154). 1.156. (a) It suffices to prove $A = X \sim g[Y \sim f[A]]$.
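
The poker answers in 1.132, 1.136, and 1.137 can be cross-checked with exact arithmetic. The following is a minimal sketch, not the book's method. It assumes a standard 52-card deck, reads the conditioning event K in 1.136 as "the hand contains one specified king" (an interpretation inferred from the stated values, not confirmed by the text), and takes |S| in 1.137 to be 52!/45!. All function names are illustrative.

```python
from fractions import Fraction
from math import comb, perm

def p_full_house():
    # value of the triple, its 3 suits, value of the pair, its 2 suits
    return Fraction(13 * comb(4, 3) * 12 * comb(4, 2), comb(52, 5))

def p_four_of_a_kind():
    # value of the quadruple, then any of the 48 remaining cards
    return Fraction(13 * 48, comb(52, 5))

def p_four_of_a_kind_given_card():
    # Assumption: K = "the hand contains one specified card".
    # Either that card belongs to the quadruple (48 choices for the fifth card),
    # or it is the odd card out (12 other values for the quadruple).
    return Fraction(48 + 12, comb(51, 4))

def p_straight_flush_given_king():
    # Assumption: the specified card is a king, which lies in exactly two
    # straight flushes of its suit (9 through K, and 10 through A).
    return Fraction(2, comb(51, 4))

def ordered_seven_card_deals():
    # 52!/45!, the number of ordered deals of seven distinct cards
    return perm(52, 7)
```

Under these assumptions the conditional and unconditional four-of-a-kind probabilities coincide, while the straight-flush probabilities do not, matching the independence conclusions stated in 1.136.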


2.84. BAR + BAT + BER + BET + BUR + BUT + CAR + CAT + CER + CET + CUR +
CUT + HAR + HAT + HER + HET + HUR + HUT. 2.86. (5 5; 3) = 5040. 2.88. −160. 2.90.
For the first identity, expand $(-1 + 1)^n$ using the binomial theorem. For a combinatorial
proof, first move negative terms (indexed by odd k) to the other side of the equation. 2.92.
$m^n/n!$. 2.93. e.g., $\binom{9}{2} = 8 + 28 = 36$; $\binom{9}{3} = 28 + 56 = 84$; so $\binom{10}{3} = \binom{9}{2} + \binom{9}{3} = 120$. 2.95. e.g.,
T(2,7) = 27, T(8,7) = 27 + 48 = 75. 2.97. 5524 (use a recursion). 2.98. 136 (draw a picture).
2.100. Find a bijection between these partitions and Dyck paths. 2.102. Integer partitions
of 8 into 3 parts are (6,1,1), (5,2,1), (4,3,1), (4,2,2), and (3,3,2), so p(8,3) = 5. Using the
recursion, p(8,3) = p(7,2) + p(5,3) = p(6,1) + p(5,2) + p(4,2) + p(2,3) = ··· = 5. 2.103.
p(13) = p(12) + p(11) − p(8) − p(6) + p(1) = 101, p(14) = 135. 2.105. e.g., S(9,2) = 255,
S(9,3) = 3025, S(9,4) = 7770, S(10,9) = 45. 2.106. B(9) = 21,147, B(10) = 115,975.
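
The numerical answers to 2.102 through 2.106 can be reproduced from the usual recursions. This is a small sketch under the assumption that p(n, k) counts partitions of n into exactly k parts (which matches the expansion p(8,3) = p(7,2) + p(5,3) shown in 2.102) and that S(n, k) and B(n) are the Stirling numbers of the second kind and the Bell numbers. The function names are illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def partitions_into_parts(n, k):
    """p(n, k): integer partitions of n into exactly k parts."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0:
        return 0
    # either some part equals 1 (remove it), or subtract 1 from every part
    return partitions_into_parts(n - 1, k - 1) + partitions_into_parts(n - k, k)

def partitions(n):
    """p(n): all integer partitions of n."""
    return sum(partitions_into_parts(n, k) for k in range(n + 1))

@lru_cache(maxsize=None)
def stirling2(n, k):
    """S(n, k): set partitions of an n-element set into k blocks."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    # element n is a singleton block, or joins one of the k existing blocks
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def bell(n):
    """B(n): all set partitions of an n-element set."""
    return sum(stirling2(n, k) for k in range(n + 1))
```
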
2.107. e.g., s(8,1) = −5040, s(8,2) = 13,068, s(8,3) = −13,132. 2.110. (a) 2”; (b)
ar’-n. (ce) 2”°-"_ 2.113. (b) {{1,7}, {2,4,6}, {3}, {5}}. 2.115. Use a recursion. 2.119.
(a) Mapping $w_1 w_2 \cdots w_n$ to its reversal $w_n \cdots w_2 w_1$ defines a bijection from $S_n^{231}$ to $S_n^{132}$.
(c) Verify that the numbers $|S_n^{312}|$ satisfy the Catalan recursion by classifying $w \in S_n^{312}$
based on the index k for which $w_k = 1$. Such a w can be written $w = w' 1 w''$, where $w'$
is any 312-avoiding permutation of {2, 3, ..., k}, and $w''$ is any 312-avoiding permutation
of {k+1, ..., n}. 2.120. (a) Given $(g_0, \ldots, g_{n-1}) \in G_n$, replace $g_0$ by an N and, for
$1 \le i < n$, replace $g_i$ by $g_{i-1} + 1 - g_i$ E's followed by an N. Add $g_{n-1} + 1$ E's at the end.
Check that this gives a bijection from $G_n$ to the set of Dyck paths of order n. 2.121.
(a) NNNEENNEENEENNENEENNEE; (c) 3254618971110. 2.124. (a) To build a set
partition of {1, 2, ..., n} into 2 blocks, choose any subset U of {1, 2, ..., n − 1} except $\emptyset$
to be one block, and let the other block be $\{1, 2, \ldots, n\} \sim U$. So $S(n,2) = 2^{n-1} - 1$.
(b) A surjection from an n-element set to an n-element set is automatically a bijection, 
by 1.27. So Surj(n,n) = n! by 1.28. 2.127. Find a recursion for f(n, a,b), the number of 
paths of length n from (0,0) to (a,b). Use a computer to obtain f(10,0,0). 2.129. (a) 
s'(u,k) = Diet eect nen Mis Mig ** Ming (c) s’(u,0) = s’(u,1) = 0, s’(u,2) = 360, 
s'(u,3) = 717, etc. 2.130. (b) Let $g_n$ be the number of subsets of {1, 2, ..., n} that do
not contain two consecutive integers. Check that $g_0 = 1 = f_0$ and $g_1 = 2 = f_1$. To see
$g_n = g_{n-1} + g_{n-2}$ for $n \ge 2$, consider whether n does not or does belong to the subset
counted by $g_n$; in the latter case, n − 1 cannot also appear in the subset. Induction now
shows $g_n = f_n$ for $n \ge 0$. 2.131. (a) Note $f_{2n} = f_{2n-1} + f_{2n-2} = (f_{2n-2} + f_{2n-3}) + f_{2n-2}$.
Now use $f_{2n-2} = f_{2n-3} + f_{2n-4}$ to eliminate $f_{2n-3}$. Initial conditions are $a_0 = 1$ and $a_1 = 3$.
2.133. (b) Divide by x − 1 and let n tend to infinity in (a) to get $\sum_{k=0}^{\infty} x^k = 1/(1-x)$
for |x| < 1. 2.134. (a) Use induction, noting that $\phi$ and $\psi$ satisfy the equation $x^2 = x + 1$.
(b) For an algebraic proof, use (a) and 2.133. 2.135. Hint: For any function $f : X \to Y$,
$\{(a, b) \in X \times X : f(a) = f(b)\}$ is an equivalence relation on X. 2.136. Given a path $\pi$
counted by C,,~, choose r so that $\pi$ arrives at (n − k − r, n − k) by a north step. 2.137.
One approach to the combinatorial proof is to use 12.1, noting that gcd(k, p — k) must be 
1. 2.138. This method of proof is due to Leibniz. 2.139. (c) 1, 1, 3, 13, 75, 541. 2.140.
(a) $B_1(0) = 1$, $B_1(1) = 0$, $B_1(n) = \sum_{k=1}^{n-1} \binom{n-1}{k} B_1(n-1-k)$; so $B_1(2) = 1$, $B_1(3) = 1$,
$B_1(4) = 4$, $B_1(5) = 11$, $B_1(6) = 41$. 2.141. pa(n,k) = 0 for n < 0 or k < 0 or k > 7;
for n > 1, 1 < k < n. 2.145. (a) Fix $v \in W$. Write $B = (v_1, \ldots, v_n)$, $C = (w_1, \ldots, w_n)$,
$[v]_B = (s_1, \ldots, s_n)^T$, and $[v]_C = (r_1, \ldots, r_n)^T$. On one hand, $v = \sum_i r_i w_i$. On the other
hand, $v = \sum_j s_j v_j = \sum_j s_j \sum_i a_{ij} w_i = \sum_i \sum_j a_{ij} s_j w_i = \sum_i (\sum_j a_{ij} s_j) w_i$. Equating
the coefficients of $w_i$ gives $r_i = \sum_j a_{ij} s_j$ for all i, which says $[v]_C = A[v]_B$. 2.150. (a)
To prove identities involving $\oplus$ or $\otimes$, these facts can be helpful: (i) c mod n = c − qn
for some $q \in \mathbb{Z}$; (ii) for $u, v \in \mathbb{Z}_n$, u = v iff n divides u − v. For associativity of $\otimes$, note
$(a \otimes b) \otimes c = (ab - qn) \otimes c = (ab - qn)c - rn = (ab)c + n(-qc - r)$ for some $q, r \in \mathbb{Z}$. Similarly,
$a \otimes (b \otimes c) = a(bc) + n(-sa - t)$ for some $s, t \in \mathbb{Z}$. As $(a \otimes b) \otimes c \in \mathbb{Z}_n$, and $a \otimes (b \otimes c) \in \mathbb{Z}_n$,
and their difference is divisible by n, these elements are equal. 2.151. $|M_n(R)| = |R|^{n^2}$.
2.153. (a) To prove the right distributive law in R, fix $f, g, h \in R$ and $n \in \mathbb{Z}$. Check that
both $(f \oplus g) \circ h$ and $(f \circ h) \oplus (g \circ h)$ send n to f(h(n)) + g(h(n)). (b) Be sure to check
that S is closed under $\oplus$ and $\circ$. 2.154. In the induction step, reindex the summations and
use 2.25. 2.156. (b) $D^n(f_1 f_2 \cdots f_s) = \sum_{n_1 + \cdots + n_s = n} \binom{n}{n_1, \ldots, n_s} D^{n_1}(f_1) D^{n_2}(f_2) \cdots D^{n_s}(f_s)$.
2.157. (c) The degree 2 polynomial $x^2 - 1 \in \mathbb{Z}_8[x]$ has four roots in $\mathbb{Z}_8$ (1, 3, 5, and 7).
2.158. Use induction on n and 2.157(b). You may need 7.44 also. 2.159. (b) Use 2.158.
2.160. $|A_n| = C_n$ (see §12.2). 2.161. See [22].
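
Two of the sequences above are easy to regenerate by machine. The sketch below assumes that 2.139(c) lists the numbers of ordered set partitions (as the cross-reference in 3.140 suggests) and that $B_1(n)$ in 2.140 counts set partitions with no one-element blocks; both readings are assumptions, and the function names are illustrative.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def ordered_set_partitions(n):
    """Choose the k elements of the first block, then order the rest."""
    if n == 0:
        return 1
    return sum(comb(n, k) * ordered_set_partitions(n - k) for k in range(1, n + 1))

@lru_cache(maxsize=None)
def no_singleton_partitions(n):
    """Recursion from 2.140(a): element n shares its block with k >= 1 others."""
    if n == 0:
        return 1
    return sum(comb(n - 1, k) * no_singleton_partitions(n - 1 - k)
               for k in range(1, n))
```

The first function yields 1, 1, 3, 13, 75, 541 for n = 0, ..., 5, and the second yields 1, 0, 1, 1, 4, 11, 41 for n = 0, ..., 6.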


3.124. (a), (e): [graph drawings not recoverable from the extracted text]. 3.125. (b) Choose to
include or exclude each of the $n^2$ ordered pairs (v, w) in the edge set of the simple digraph.
By the product rule, the answer is $2^{n^2}$. 3.127. The bijection maps (V, E) to the relation
$R \subseteq V \times V$ such that for $u, v \in V$, $(u, v) \in R$ iff $\{u, v\} \in E$. 3.129. There are eleven
isomorphism classes for four-vertex simple graphs. 3.130. (a) [adjacency matrix garbled in
extraction; the legible rows are 0 1 1 1 and 1 0 0 0]; (g) an $n \times n$
matrix with zeroes on the main diagonal and ones elsewhere. 3.131. For $1 \le k \le 5$,
answers are 1, 1, 7, 9, and 57. 3.132. There are 64 walks from 0 to 0, 36 from 0 to 
1, 120 from 0 to 2, 76 from 1 to 2, etc. 3.134. (a) Use induction on the length of the 
walk. (b) No; consider a graph with a single edge. 3.135. (a) if i = j, zero; if $i \ne j$,
$\sum_{k \ne i,j} A(i,k)A(k,j) = A^2(i,j) - A(i,j)(A(i,i) + A(j,j))$. 3.136. (b) 8. 3.138. (a) 8 paths;
(b) 13 paths. 3.139. $2^{n-2}$ for $n \ge 2$. 3.140. $\sum_{k=1}^{n} k!\,S(n,k)$, the number of ordered set
partitions of {1, 2, ..., n} (see 2.139). 3.142. (a) deg(G) = [2,2,2,2,2,2,3,3,3,4,4,5,6];
sum of degrees is 40 = 2 · 20 = 2|E|. 3.144. (a) $\deg(C_n) = [2, 2, \ldots, 2]$ where 2 occurs n
times; (c) (n − 1)!/2; (d) p(n), the number of integer partitions of n. 3.146. The smallest
example has five vertices. 3.147. The statement is false. 3.148. Find a three-vertex
example. 3.149. Note $\sum_{e \in E} M(v,e) = \deg_G(v)$ for $v \in V$ and $\sum_{v \in V} M(v,e) = 2$ for $e \in E$.
3.150. (b) Digraph has edges (0,1), (1,2), (2,5), (3,3), (4,3), (5,5), (6,2). C = {3, 5},
$S_3 = \{3, 4\}$, and $S_5 = \{0, 1, 2, 5, 6\}$. 3.152. Study the walk through the functional digraph
of f determined by the sequence $(x_i : i \ge 0)$. 3.153. (a) With the notation of 3.152, the
algorithm computes $\gcd(x_{2i} - x_i, N)$ for i = 1, 2, ... until this gcd is not 1. For some i > 0,
$x_{2i} = x_i$, so the gcd for this i is N, and the algorithm terminates. However, the algorithm
may terminate sooner if gcd(v − u, N) is a proper divisor of N. (b) For N = 77: in step 1,
(u, v, d) = (1, 2, 1). In step 2, (u, v, d) is (2, 26, 1), then (5, 26, 7). So 7 is a divisor. 3.154.
(a) $(k-1)(k-2)\cdots(k-s)/k^s$. (b) Observe that, if $x_0, \ldots, x_{i-1}$ are all distinct and $i > \sqrt{k}$,
then the probability that S = i is at least $1/\sqrt{k}$. (c) N has a prime divisor $p \le \sqrt{N}$.
Consider Y = {0, 1, ..., p − 1} and $g : Y \to Y$ given by $g(y) = (y^2 + 1) \bmod p$. Assume
g behaves approximately like a random function. 3.156. (a) 1960. 3.158. $n!/\prod_i (a_i!\, i^{a_i})$.
3.161. The function sends 1,2,...,17 to 1, 12, 3, 4, 10, 17, 15, 8, 3, 3, 12, 1, 4, 10, 1, 4, 17, 
respectively. 3.163. (1,a,3, f,5,m,2,k,4). 3.164. Each cycle is a strong component; and 
each v not on a cycle is in a component by itself. 3.168. 38. 3.170.(a) A walk of odd length 
in G that starts in A (resp. B) must end in B (resp. A), so could not be a cycle. 3.171. 
2” 3.172. (a) kn/2 (by 3.34); (b) argue that G has k|A| edges and also k|B| edges. 3.173. 
The statement is false. 3.175. 2(n — k). 3.179. The case where |V(T;)| = 1 for some i can 
be handled directly. Otherwise, use induction and pruning. 3.180. Use induction on m and 
pruning. (The result is due to Smolenskii.) 3.181. 1, 1, 1, 2, 3, 6, 11. 3.182. (a) p(n, 3). 
3.183. n!S(n − 2, n − k)/k! (see [112]). 3.184. (b) $(n-2)n^{n-3}$. 3.186. In an n-vertex tree,
delete the unique edge incident to the largest vertex that leads towards the smallest vertex.
3.187. (a) 2274547; (b) edges are {8,1}, {6,3}, {4,5}, {2,5}, {5,1}, {1,7}, {7,3}, {3,0}.
3.190. (a) 8580. 3.192. Use 3.91 with n = 1, ko =m, ki = 0, kg = m-—1, s = 2m—1 to get 
a (Paka = Cym—1 parenthesizations. 3.194. (a) a(x —1)%; (c) a(a@—1)(a— 2)?. 3.197. 
(a) 4; (b) 108,000. 3.198. (a) 7P_, (7)2**-Y/?; (b) 2" — 1. 3.203. Let X be the set of 
pairs (T, e) where T is a spanning tree of $G_{n-1}$ and e is one of the k edges of the rightmost
k-gon in $G_n$. Map X bijectively to the set of trees that span $G_{n-2}$ or $G_n$. Initial conditions
are $\tau(G_0) = 1$ and $\tau(G_1) = k$. 3.204. (a) Let X be the set of pairs (e, v) where $e \in E(G)$
and $v \in V(G)$ is not an endpoint of e. Choose e and then v to see |X| = |E(G)|(n − 2).
Choose v and then e to see $|X| = \sum_{v \in V(G)} |E(G \sim v)|$. 3.205. (a) 1 tree (the whole graph),
and $\det\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = 1$. (c) 8 trees, and $\det\begin{pmatrix} 2 & -1 & 0 \\ -1 & 3 & -1 \\ 0 & -1 & 2 \end{pmatrix} = 8$. 3.208. The determinant
is the characteristic polynomial of uJ. What are the minimal polynomial and eigenvalues
(with multiplicities) of uJ? 3.209. Use 3.208 with m = n—1, t = n, and u = 1. 3.210. 
3.208 can be useful here. 3.212. (a) G is connected and every vertex has even degree. 3.214.
Construct a digraph with vertex set $A^{k-1}$ and labeled edges $y_1 \cdots y_{k-1} \to y_2 \cdots y_{k-1} z$ for
each $z \in A$. Argue that this digraph has a closed Eulerian tour, and use this to construct
the word w. 3.215. (a) 10 vertices, 15 edges; (d) No, since G has cycles of odd length.
3.217. Use the product rule and 3.215(c),(e) to show each e belongs to four 5-cycles. So G
has 15 · 4/5 = 12 5-cycles. 3.218. (a) Define a bijection between the set of such cycles and
V(G). In the second picture of G shown in 3.215(b), the outer cycle (A, B, C, D, E, F, A)
maps to the central vertex J. (b) 10.
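
The trace in 3.153(b) can be reproduced with a short sketch of the cycle-finding factorization loop described in 3.153(a) (commonly known as Pollard's rho method). The iteration map $x \mapsto (x^2 + 1) \bmod N$ and the starting value 0 are assumptions; they are consistent with the triples listed in 3.153(b) and with the map g in 3.154(c), but the exercise statement itself is not reproduced here. The function name is illustrative.

```python
from math import gcd

def rho_trace(N, x0=0):
    """Return (d, trace), where trace lists the triples (u, v, d) per step.

    Assumed iteration: f(x) = (x*x + 1) % N, with u advancing one step
    and v advancing two steps each round (Floyd cycle detection).
    """
    f = lambda x: (x * x + 1) % N
    u = v = x0
    trace = []
    while True:
        u = f(u)
        v = f(f(v))
        d = gcd(v - u, N)  # math.gcd returns a nonnegative value; gcd(0, N) = N
        trace.append((u, v, d))
        if d != 1:
            return d, trace
```

Calling `rho_trace(77)` stops on the third round with d = 7. If d comes back equal to N, the run found no proper divisor and a different map or starting value would be needed.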


4.60. (a) $|S \cup T| = 15 + 13 - 6 = 22$; (b) $|S \cup T \cup U| = 15 + 13 + 12 - 6 - 3 - 4 + 1 = 28$;
(c) 17. 4.62. 2143, 2341, 2413, 3142, 3412, 3421, 4123, 4312, 4321. 4.63. (d)
$d_{10} = 10 d_9 + 1$ = 1,334,961. 4.64. (b) $\phi(11) = 10$, $\mu(11) = -1$, $\tau(11) = 2$, $\sigma(11) = 12$. (c)
$\phi(28) = 12$, $\mu(28) = 0$, $\tau(28) = 6$, $\sigma(28) = 56$. 4.65. (a) 1 + 1 + 2 + 2 + 2 + 4 + 4 + 8 = 24
and 24 − 12 − 8 + 4 = 8 = $\phi(24)$. 4.67. $\binom{52}{5} - 4\binom{39}{5} + 6\binom{26}{5} - 4\binom{13}{5}$ = 685,464. 4.71.
m(n − 1)!S(m − 1, n − 1). 4.72. (a) $x^4 - 4x^3 + 6x^2 - 3x$. 4.74. (a) If m has prime factorization
$\prod_i p_i^{e_i}$, then $\sigma_k(m) = \prod_i (1 - p_i^{k(e_i+1)})/(1 - p_i^k)$. (b) Use 4.30. 4.76. Writing permutations
in one-line form, 1234 pairs with 2134, 1243 pairs with 2143, 1324 pairs with 2314, ..., and
4321 pairs with 4312. 4.78. w is a 6-cycle (1,4,2,3,6,5). If we splice 7 in just before the 2
(say), we get the 7-cycle (1,4,7,2,3,6,5), which is $4367152 \in D_7$. If we pair 8 with 5 (say)
in a 2-cycle and relabel w to be (1,4,2,3,7,6) we get $43728165 \in D_8$. 4.82. (a) Expand
$(2 - 1)^n = 1$ using the binomial theorem. (b) Consider n-letter words using letters {a, b, c},
with an appropriate definition of signs. 4.83. $(a - b)^n$. 4.85. (a) In the case $S \ne T$, define an
involution based on whether a fixed element of T ~ S is or is not in U. 4.87. (a) (3”— 1)"; 
(b) (8” — 2)"; (c) use inclusion-exclusion. 4.88. DB a a CP aa 4.92. (a) In 4.11, 
let $S_i$ be the set of words in $R(1^3 2^3 \cdots n^3)$ that contain three i's in a row. To compute (say)
$|S_1 \cap S_2|$, think of 111 (and 222) as a single letter. 4.95. (a) Use 4.48, classifying signed
chains from x to z based on the next-to-last point y. 4.96. For $x, y \in X$, define $x \le y$ iff
there exist $k \ge 0$ and $x_0, \ldots, x_k \in X$ such that $x_0 = x$, $x_k = y$, and $(x_{i-1}, x_i) \in R$ for
$1 \le i \le k$. 4.97. (a)
$Z = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$, $M = Z^{-1} = \begin{pmatrix} 1 & -1 & -1 & -1 & 2 \\ 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$. (b) For
instance, $\mu(a, e) = 2$ since there are three positive chains of length 2 (through b, c, or d) and
one negative chain (a, e) of length 1. 4.99. Use 4.95 and induction on the number of elements
in the interval. 4.100. $\mu_X(a,b) = \mu_{X_1}(a,b)$ if $a, b \in X_1$; $\mu_X(a,b) = \mu_{X_2}(a,b)$ if $a, b \in X_2$;
and $\mu_X(a,b) = 0$ otherwise. 4.102. One can either invert Z using the algebra of block
matrices, or count signed chains in X. 4.105. (b) Repeated use of 4.104 shows that the events
$X \sim S_1, \ldots, X \sim S_n$ are independent. Since $P(X \sim S_i) = 1 - p_i$, the desired probability is
$\prod_{i=1}^{n} (1 - p_i)$. 4.106. The left side counts i-element subsets of {1, 2, ..., n} in which some
elements in the subset have been marked with negative signs. If i > 0, take the least element
in the subset and flip its status (marked or not). 4.109. n!. 4.111. A~1(i, 7) = (-1)*7 Ca 
4.113. 41,304. 4.115. Apply 4.11, letting $S_i$ be the set of compositions $(a_1, \ldots, a_k)$ with
$a_1 + \cdots + a_k = n$ and $a_i > m$. 4.116. One might try inclusion-exclusion, but it may be easier
to use 2.140. For m = 11 and n = 4, the answer is 1,367,520. 4.117. Try to generalize
one of the proofs in §4.3. 4.118. (a) 7")'(—1)*24. (b) combinatorial proof: choose k of
the n objects to be fixed points ($\binom{n}{k}$ ways); then choose a derangement of the remaining
n − k objects ($d_{n-k}$ ways); now use the product rule. 4.119. 616. 4.120. For n > 0, the
sum is $(-1)^n F_{n-1} - 1$ (as can be proved by induction). 4.122. 1. 4.124. $(-1)^{n-1}(n-1)!$.
4.126. Starting at xo € Fix(J), repeatedly apply f, then J, then f~!, then J, until some 
application of f leads to an element in Fix(J). Argue that this process must terminate and 
is a bijection. 4.127. Starting at z € B, keep applying g until C is reached. 4.128. See [28] 
for one approach. 4.129. See, e.g., [87], which provides an automatic method for converting 
a bijective proof of a matrix identity AB = I into a bijective proof of BA = I. 4.130. 
Careful analysis of the proofs of 4.24 and 4.25 leads to a recursively defined map. For other 
approaches, see [110, 138]. 4.131. (d) Use (c) and 4.95. The key fact, which can be proved
by induction, is that $\mu_{X_n}(\{\{1\}, \{2\}, \ldots, \{n\}\}, \{\{1, 2, \ldots, n\}\}) = (-1)^{n-1}(n-1)!$. See [127],
Section 3.10, for a full discussion.
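
Several of the inclusion-exclusion answers above are quick to confirm numerically. The sketch below uses the recursion $d_n = n d_{n-1} + (-1)^n$ visible in 4.63(d), lists derangements by brute force for comparison with 4.62, and evaluates an alternating sum that matches the value in 4.67 if that exercise counts five-card hands containing all four suits (that reading of 4.67 is an assumption). The function names are illustrative.

```python
from itertools import permutations
from math import comb

def derangement_count(n):
    """d_n via d_m = m * d_{m-1} + (-1)^m, starting from d_0 = 1."""
    d = 1
    for m in range(1, n + 1):
        d = m * d + (-1) ** m
    return d

def derangements(n):
    """All derangements of 1..n in one-line form, in lexicographic order."""
    return ["".join(map(str, w))
            for w in permutations(range(1, n + 1))
            if all(w[i] != i + 1 for i in range(n))]

def hands_with_all_suits():
    """Inclusion-exclusion over the set of suits missing from a 5-card hand."""
    return sum((-1) ** j * comb(4, j) * comb(52 - 13 * j, 5) for j in range(5))
```
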


5.45. 9+f : {a,b,c,d,e} > 5 sends a to 3, b to 4, c to 2, d to 1, ande to 0. 5.46. (a) 23; (c) 31; 
(ec) (4,2). 5.47. (b) (1,1, 1,0, 1); (c) 153. 5.48. (a) 91; (d) (2,0, 1,0). 5.49. (b) gotg1+92+93 
maps (0,0) to 0, (1,0) to 1, (2,0) to 2, (0,1) to 3, ..., (1,3) to 10, and (2,3) to 11. 5.50. (a) 
7944; (c) 299,140. 5.51. (a) good; (d) this. 5.53. (a) 12,129; (f) BNA. 5.54. (a) 11,553; (f) 
TNA. 5.55. Suggestion: use a space character. (b) 1875. 5.56. (a) Choose the digits from 
left to right to get 8-9-9-4 = 2592; (c) rank of 2500 is 504; (d) 2501 unranks to 9742. 5.57. 
(a) rank of LEVEL is 7561; (b) 12,662 unranks to STATS. 5.58. (a) 6,187,926. 5.60. (a) 
rank of bfdc is $p_{6,5,4,3}(1, 4, 2, 1) = 115$. 5.61. (a) rank of 42153 is $p_{5,4,3,2,1}(3, 1, 0, 1, 0) = 79$.
(b) $p^{-1}_{5,4,3,2,1}(46) = (1, 3, 2, 0, 0)$, so 46 unranks to the permutation 25413. 5.63. (a) rank
of {b, c, d, h} is 38; (b) 40 unranks to {a, c, e, h}. 5.65. (a) rank of bbccacba is 253; (b)
206 unranks to bbabccac. 5.67. (a) $r_{9,3}((3,3,3)) = 0$; (b) $u_{12,3}(6) = (3,3,1,1,1,1,1,1)$.
5.68. (4,4), (4,3,1), (4,2,2), (4,2,1,1), (4,1,1,1,1). 5.71. (a) rank of {{1,3}, {2,4,5}} 
is 13; (b) 247 unranks to {{1, 6}, {2,3}, {4,7}, {5}}. 5.73. (a) This hand is generated by 
the data (x, B,y, C) = (3, {@, >, 9}, 9, {0, @}), so the rank is p13,4,12,6(2, 0, 7,4) = 622. (b) 
ig 4.13-6(3082) = (10, 2,9, 4), giving data (x, B,y,C) = (J, {&, 9, @}, 10, {>, @}). The hand 
is {J&, JO, J&, 10, 10@}. 5.77. (a) NENENENE, NENENNEE, NENNEENE, NENNENEE,
NENNNEEE, NNEENENE, NNEENNEE, NNENEENE, NNNEEENE, NNENENEE,
NNENNEEE, NNNEENEE, NNNENEEE, NNNNEEEE. 5.79. (a) T maps to the
word 0005654100€ 11°, which has rank 900,350. 5.80. (a) ccbacbd; (c) 01101101. 5.81. 
(a) ccbabcd; (c) 01100111. 5.82. First successor is NNNENEENNEENNNEEEE; first
predecessor is NNNENEENNEENNEENEE. 5.84. For b < 0, apply 5.3 to a and |b|, and then
replace q by −q. 5.85. Obtain an initial q and r using 5.84. If r is too big, modify it by
adjusting q appropriately. 5.87. (b) Take $F = \mathbb{Z}$, f = 3x, g = 2x. If f = qg + r as in (a),
then r and q must be integers and hence 2q = 3, which is impossible. (c) Yes, since the
proof of (a) works if the leading coefficient of g has a multiplicative inverse in F. 5.89.
(a) ($\Leftarrow$): Assume gcd(s, t) = 1, so lcm(s, t) = st. We show f is one-to-one. Let $x, y \in \underline{st}$
with f(x) = f(y). Then x mod s = y mod s and x mod t = y mod t, so s divides x − y
and t divides x − y. Thus, x − y is a common multiple of s and t strictly between −st
and st. Since lcm(s, t) = st, x − y = 0, so x = y. Since f is one-to-one, the image of
f has size $|\underline{st}| = st = |\underline{s} \times \underline{t}|$. So f is a bijection. 5.91. Develop a recursive algorithm
using the base-2 expansion of e and the identities $x^{2f} \bmod n = (x^f \bmod n)^2 \bmod n$ and
$x^{f+1} \bmod n = ((x^f \bmod n) \cdot x) \bmod n$. 5.93. (b) Rename a, b, c, d, e to be 0, 1, 2, 3, 4. Using
the bijection in the second proof in §1.11, [b, b, c, d, d, d] maps to the subset {1, 2, 4, 6, 7, 8}
of {0, 1, ..., 9}, which has rank 70. 132 unranks to {1, 2, 5, 6, 7, 9}, which maps to the
multiset [b, b, d, d, d, e]. 5.94. Let n tend to infinity in 5.22. 5.97. One approach is to use a
bijection to map partitions of n with k distinct parts to partitions of $n - \binom{k}{2}$ with first part
k, and then apply the algorithms in §5.8. 5.100. As seen in §1.13, a three-of-a-kind hand
is uniquely determined from data $(x, B, C, s_1, s_2)$, where $x \in$ Values, B is a three-element
subset of Suits, C is a two-element subset of Values $\sim \{x\}$, and $s_1, s_2 \in$ Suits. Let the rank
of the hand be $p_{13,4,66,4,4}(r(x), r(B), r(C), r(s_1), r(s_2))$. 5.102. (a) The straight is
determined from the lowest value v (which is in {A, 2, ..., 10}) and a word in $\text{Suits}^5$. The given
[remainder of this sentence lost in extraction]
$p^{-1}_{10,4,4,4,4,4}(1574) = (1, 2, 0, 2, 1, 2)$, leading to the hand {20, 3, 49,50, 60}. 5.105.
Ranking: If $n \notin S$, let $r_n(S) = r_{n-1}(S)$. If $n \in S$, let $r_n(S) = f_{n-1} + r_{n-2}(S \sim \{n\})$. Unranking:
If $0 \le k < f_{n-1}$, let $r_n^{-1}(k) = r_{n-1}^{-1}(k)$. Otherwise, let $r_n^{-1}(k) = r_{n-2}^{-1}(k - f_{n-1}) \cup \{n\}$.
5.106. (b) 412. 5.107. Successor of (9, 3) is (10, 2). 5.108. Successor of (7, 4, 2, 1) is (7, 4, 3).
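
The ranking and unranking rules stated in 5.105 translate directly into a pair of recursive functions. This is a sketch under two assumptions: the objects being ranked are subsets of {1, ..., n} with no two consecutive elements, and $f_n$ follows the convention $f_0 = 1$, $f_1 = 2$ used in 2.130(b). The names `fib`, `rank`, `unrank`, and `sparse_subsets` are illustrative.

```python
from functools import lru_cache
from itertools import combinations

@lru_cache(maxsize=None)
def fib(n):
    """f_0 = 1, f_1 = 2, f_n = f_{n-1} + f_{n-2} (assumed convention)."""
    return n + 1 if n < 2 else fib(n - 1) + fib(n - 2)

def rank(n, S):
    S = frozenset(S)
    if n <= 0:
        return 0
    if n not in S:
        return rank(n - 1, S)
    # n is present, so n - 1 is absent: skip ahead to n - 2
    return fib(n - 1) + rank(n - 2, S - {n})

def unrank(n, k):
    if n <= 0:
        return frozenset()
    if k < fib(n - 1):
        return unrank(n - 1, k)
    return unrank(n - 2, k - fib(n - 1)) | {n}

def sparse_subsets(n):
    """All subsets of {1..n} without two consecutive integers (brute force)."""
    return [S for r in range(n + 1) for S in combinations(range(1, n + 1), r)
            if all(b - a > 1 for a, b in zip(S, S[1:]))]
```

For n = 6 the 21 qualifying subsets receive the ranks 0 through 20 exactly once, and `unrank` inverts `rank`.
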
5.110. (b) For one ordering (based on 5.30), the successor is {Jé, J, J@,9O, 990}. (c) 
{ Jd, JO, J&, 9, 9}. 5.113. Let $n = n_1 + \cdots + n_k$. To choose $w = w_1 \cdots w_n$, randomly
choose $w_1$ to be $a_i$ with probability $n_i/n$ ($1 \le i \le k$). Decrement $n_i$ by 1, and recursively
choose $w_2 \cdots w_n$. 5.117. No. For instance, when n = 3, check that the probability of
generating w = 123 is $4/27 \ne 1/6$. 5.118. Yes (use induction on i). 5.120. Find a recursion
based on the initial letter of the word, and convert this to a ranking algorithm.
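
Many of the ranks quoted in this chapter's answers are values of a mixed-radix map $p_{n_1, \ldots, n_k}$. The sketch below assumes the usual left-to-right convention (the first coordinate is the most significant digit), which agrees with the numbers in 5.60, 5.61, 5.73, and 5.102, and it ranks permutations through the sequence of positions each entry occupies among the values not yet used. The function names are illustrative.

```python
def mixed_radix_rank(radices, digits):
    """p_{n1,...,nk}(c1,...,ck), most significant digit first."""
    rank = 0
    for n, c in zip(radices, digits):
        if not 0 <= c < n:
            raise ValueError("digit out of range")
        rank = rank * n + c
    return rank

def mixed_radix_unrank(radices, rank):
    """Inverse of mixed_radix_rank for 0 <= rank < product of radices."""
    digits = []
    for n in reversed(radices):
        rank, c = divmod(rank, n)
        digits.append(c)
    return tuple(reversed(digits))

def perm_rank(w):
    """Rank a permutation w of 1..n given as a list of ints."""
    remaining = sorted(w)
    code = []
    for x in w:
        i = remaining.index(x)   # position of x among unused values
        code.append(i)
        remaining.pop(i)
    return mixed_radix_rank(range(len(w), 0, -1), code)

def perm_unrank(n, rank):
    code = mixed_radix_unrank(list(range(n, 0, -1)), rank)
    remaining = list(range(1, n + 1))
    return [remaining.pop(i) for i in code]
```
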


6.50. (a) e.g., inv(1432) = 3, des(1432) = 2, maj(1432) = 5; inv(3124) = 2, des(3124) = 1,
maj(3124) = 1. (b) $G_{S_4,\mathrm{inv}}(x) = G_{S_4,\mathrm{maj}}(x) = 1 + 3x + 5x^2 + 6x^3 + 5x^4 + 3x^5 + x^6$;
$G_{S_4,\mathrm{des}}(x) = 1 + 11x + 11x^2 + x^3$. 6.51. For w = 314423313, inv(w) = 16, Des(w) = {1, 4, 7},
des(w) = 3, and maj(w) = 12. 6.52. e.g., the path NNENEENE (or 00101101) has area 2
and major index 9. 6.54. Build w in S by choosing $w_i \in \underline{n}$ for i = 1, 2, ..., k. The generating
function for the ith choice is $[n]_x$, so the product rule for weighted sets gives $G_{S,\mathrm{wt}}(x) = [n]_x^k$.
6.56. $(1 + x)^n = [2]_x^n$. 6.57. One approach is to use a bijection between multisets and lattice
paths. 6.58. (a) 3017. 6.59. (a) 139. 6.61. To begin, introduce suitable signs and weights
on the set of all subsets of n. 6.62. (c) 1+a+22?+3a3+424+52° +62° +627 + 62° +629 + 
Bal 4 dol 4 Bel? 4 Dal8 4 lt + 95: (ce) Lt et? taro t+at+25°4+2° +25. 6.63. (a) [6]2 = 
$(1 + x)(1 + x + x^2)(1 - x + x^2)$ in $\mathbb{Z}[x]$; (b) $[6]_x = (x - [1 + i\sqrt{3}]/2)(x - [-1 + i\sqrt{3}]/2)(x + 1)(x -
[-1 - i\sqrt{3}]/2)(x - [1 - i\sqrt{3}]/2)$ in $\mathbb{C}[x]$. 6.64. (f) The partitions are (0), (1), (2), (1,1), (2,1),
and (2,2), so $\begin{bmatrix} 4 \\ 2 \end{bmatrix}_x = 1 + x + 2x^2 + x^3 + x^4$. 6.65. Compare to the second proof of 1.46. 6.66.
(c) For all $z \in T_1$, $w_3((g \circ f)(z)) = w_3(g(f(z))) = w_2(f(z)) = w_1(z)$. 6.68. (b) 87654321;
weight is 6 + 6 + 16 = 28. 6.69. (c) (45321, 4321, 111001000); weight is 9 + 6 + 18 = 33.
6.71. Generalize the second proof of 6.36. 6.72. (b) {fg 22") Fecal = eal (look 


at area in Figure 2.2). 6.74. (a) 1728 + 8642? + 86427? + 288°. 6.75. (d) 10(3+2)°. 6.76. 
Gr, (x) = 16x3 + 152+ + 62° + 2°. 6.77. fe(341265) = (0,0, 2,2,0,1) with both weights 5;
g6(0, 0,1, 3, 2,3) = 416532 with both weights 9. 6.79. For w € S,,, define f(w) = (t1,...,tn) 
where t, = |{(i,j) € Inv(w) : w; =n+1—}|. 6.80. £(35261784) = (0,1, 1,0,2,1,3,2) and 
f—1(0,1, 0,3, 2,4,6,5) = 68537142. 6.82. f6(341265) = (0,0,1,1,0,5) with both weights 
7; fg ‘(0,0,1,3,2,3) = 635142 with both weights 9. 6.85. Both sums equal [n]!,,. 6.87. 
Adapt the proof of 6.29. 6.89. (a) If w, = n, the cyclic shift creates a new descent at 
position n — 1 and does not affect the other descents. If w1 = n, the cyclic shift removes 
1 from the descent set. If wy = n where 1 < k < n, the cyclic shift replaces k by k — 1 
in the descent set. (b) Apply the cyclic shift n — & times, and then erase w, = n. 6.90. 
(b) 19,144. 6.91. h3(w) = 245331524515132, so inversions decrease by 6 = ns3(w). 6.92. 
hy!(w) = 425331542511523. 6.93. e.g., g(3124) = 1324 and g(1432) = 4312. 6.94. (a) 
4526173. 6.95. (a) 2415673. 6.96. e.g., g(100110) = 100100. 6.97. (a) $G_{W_0} = 1$, $G_{W_1} =
x + 1$, $G_{W_n} = G_{W_{n-1}} + xG_{W_{n-2}}$ for $n \ge 2$, so $G_{W_6}(x) = 1 + 6x + 10x^2 + 4x^3$. 6.98.
$G_{n,k} = G_{n-1,k-1} + G_{n-1,k}[k]_x$ (with appropriate initial conditions). 6.100. 3.47 and 6.54
may be helpful here. 6.102. Show $\prod_{i=1}^{n}(1 + tx^i) = \sum_k t^k x^{k(k+1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_x$. 6.103. (b) Show
$(i, j) \in \mathrm{Inv}(w)$ iff $(w_j, w_i) \in \mathrm{Inv}(w^{-1})$ for all $i, j \le n$. (c) $G_3(x, y) = 1 + xy + xy^2 + x^2y +
x^2y^2 + x^3y^3$. To prove $G_n(x, y) = G_n(y, x)$, use (a). 6.104. (b) Let $I(w) = w^{-1}$ as in 6.103,
and consider the composition $I \circ g \circ I \circ g^{-1} \circ I$. 6.105. $G_n(x) = x^{n(n-1)/2} C_n(x^{-1})$. 6.106.
(a) $C_3(q, t) = q^3 + q^2 t + q t^2 + t^3 + qt$. (d) See [64] or [85]. (e) This is known to be true,
using hard results in [48], but at this time (July 2010) no direct bijective proof is known.
6.107. (a) k~! sends a path P € Dy, to (go,---,9n—1), Where gj is the number of area cells 
to the right of the path in the ith row from the bottom; check that this sequence is in Gp. 
(b) Map g € Gn N R(0" 1"! -- + 8°) to a Dyck path whose bounce path has vertical moves 
U0, VU1,+++5Us- 
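
The statistics in 6.50 and 6.51 are small enough to check by brute force. The sketch below computes inv, des, and maj for a word and tabulates the coefficients of the resulting generating functions over $S_n$; it uses the standard definitions (an inversion is a pair of positions i < j with $w_i > w_j$, a descent is a position i with $w_i > w_{i+1}$, and maj is the sum of the descent positions). The function names are illustrative.

```python
from collections import Counter
from itertools import permutations

def inv(w):
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])

def descent_set(w):
    # positions are 1-based, as in Des(w) = {1, 4, 7} for 6.51
    return [i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]]

def des(w):
    return len(descent_set(w))

def maj(w):
    return sum(descent_set(w))

def distribution(n, statistic):
    """Coefficient list [c_0, c_1, ...] of the sum over S_n of x^statistic(w)."""
    counts = Counter(statistic(w) for w in permutations(range(1, n + 1)))
    return [counts[k] for k in range(max(counts) + 1)]
```

For n = 4 the inv and maj coefficient lists coincide, which is the equality of generating functions recorded in 6.50(b).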


7.105. f+g=1—-c-—2 +62+, fo = 2 —327 +22? + 30° — 32° —30°+ 92°, deg(f)=4= 
deg(g) = deg(f + g), deg(fg) = 8, ord(f) = 1 = ord(fg), ord(g) = 0 = ord(f + g). 7.107. 
P,(/5) = 6V5, P;(x) = 2? +42—1= f, Pr(g) = -1+42+2? +403 4 224 +2°. 7.108. (b) 
x+a? +a3/3—2x°/30—2x°/90. 7.109. (a) f, 9, f +g are nonzero, deg(f) = n = deg(g), and 
fn = —Qn- 7-110. (c) 1— 2+ 32?/2— 7273/3 + 1127/3. 7.113. (a) —327-? — 3271 — 4-42 

$4x^2 - 4x^3 - \cdots$. 7.114. (a) $(\sinh x)_n = \chi(n \text{ is odd})/n!$. 7.115. additive identity axiom: let
$Z_n = 0_K$ for all $n \in \mathbb{N}$. Fix $F \in K[[x]]$. For $n \in \mathbb{N}$, $(F + Z)_n = F_n + Z_n = F_n + 0_K = F_n =
(Z + F)_n$ (by additive identity axiom in K), so F + Z = F = Z + F. 7.117. (a) Let f = 2
in $\mathbb{Z}_4[x]$ (the ring $\mathbb{Z}_n$ is defined in 2.150). (b) Let f = 1 + 2x in $\mathbb{Z}_4[x]$; note $f^{-1} = f$. 7.118.
Use 7.61 and 7.92. 7.120. (a) Use 2.79. (b) If R is finite, what is |R[x]|? What is $|{}^{R}R|$? 7.121.
(c) Since $c = cx^0 \in K[x]$, the definition 7.17 gives $P_c(z) = c(0)z^0 = c \cdot 1_R = c$. 7.123. (b)
For all $n \in \mathbb{N}$, $(cF)'(n) = (n+1)(cF)(n+1) = c(n+1)F(n+1) = c(F'(n)) = (c(F'))(n)$.
So $(cF)' = c(F')$. 7.125. (b) Fix $F, G, H \in K[[x]]$ with G(0) = 0. Use 7.32 to choose
polynomials $f_n, g_n, h_n \in K[x]$ with $g_n(0) = 0$ for all n, $f_n \to F$, $g_n \to G$, and $h_n \to H$. For
each $n \in \mathbb{N}$, 7.57(c) gives $(f_n h_n) \circ g_n = (f_n \circ g_n)(h_n \circ g_n)$. Now let $n \to \infty$ and use 7.35
and 7.61. 7.127. (c) For $n \in \mathbb{N}$, $[\log(1+x)]'(n) = (n+1)\log(1+x)(n+1) = (n+1)(-1)^n/(n+1)
= (-1)^n$. Now, $\sum_{n \ge 0} (-1)^n x^n = (1+x)^{-1}$ by 7.41. 7.129. (a) 32 + 2? — 7x3/3 + 22°;
(d) $e^x - 1$; (g) log(1 + x). 7.130. (a) Both sides have constant term zero; for all n > 0,
$(\int F + G\,dx)(n) = (F + G)(n-1)/n = F(n-1)/n + G(n-1)/n = (\int F\,dx)(n) + (\int G\,dx)(n)$.
(f) Set $G = \int F\,dx$ and $H = G'$; for all $n \in \mathbb{N}$, $H(n) = G'(n) = (n+1)G(n+1) =
(n+1)F((n+1)-1)/(n+1) = F(n)$. For $\int F'\,dx$, compute the constant term separately. 7.131.
Integration by parts rule: for all $F, G \in K[[x]]$, $\int F(G')\,dx = FG - F(0)G(0) - \int F'G\,dx$
(prove it by integrating 7.54(e) and using 7.130(a),(f)). 7.133. (a) Coefficient of x” is 
zero for n odd, and Yy;_9 (;)(—1)* = x(n = 0) for n even. 7.136. Use induction on k. 
7.139. $F = 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + \cdots$. (We will study this infinite
product in Chapter 8.) 7.140. For existence of the infinite products, use 7.41 and 7.33.
To verify the equalities, look at a fixed coefficient and reduce to finite products as in 7.42.
7.142. 592" = (1-2). 7.144. (b) $\sqrt{1 + x} = \sum_{n \ge 0} ((1/2){\downarrow}_n/n!)\,x^n = 1 + x/2 -
x^2/8 + x^3/16 - 5x^4/128 + \cdots$. 7.145. (a) $1 + x/2 + 11x^2/8 - 11x^3/16 + \cdots$. 7.146. (a)
$1 + x + x^2/2 - x^4/8 + \cdots$. 7.147. F = 3/(1 + 2x) + 7/(1 − 4x), so $F_n = 3 \cdot (-2)^n + 7 \cdot 4^n$.
7.149. $F = x - 3/(1-x)^2 + (3/2)/(1-x)$, so $F_n = \chi(n = 1) - 3n - 3/2$. 7.151. (b)
$F = \sum_n a_n x^n$ satisfies $(1 - 3x)F = 2 + (3x/(1-x)^2)$, so $a_n = (17/4)3^n - (3/2)n - 9/4$.
7.152. (a) $a_n = 4^n - 2^n$; (c) $F = \sum_n a_n x^n$ satisfies $F(1 - 6x + 8x^2) = (1 - 2x)^{-1} - 1$, so
$a_n = 2 \cdot 4^n - (n+2) \cdot 2^n$. 7.154. $\sum_n a_n x^n = (x + 5x^2 + 7x^3 + 3x^4)/(1 + 3x - 2x^2 - 6x^3 + x^4 + 3x^5) =
(1-x)^{-2} - (1-x)^{-1}$, so $a_n = n$ for all $n \ge 0$. 7.157. Treat the case L = 2 separately. 7.159.
(a) $\sec x = 1 + (1/2)x^2 + (5/24)x^4 + (61/720)x^6 + (277/8064)x^8 + \cdots$. (b) Use 7.133(a). (c)
Use 7.128. (e) Work in the field K((x)). 7.160. (a) One approach is to show both sides satisfy
$G'' + 4G = 0$, $G_0 = 0$, $G_1 = 2$. (d) To start, note $\exp(ix) = \sum_{n \ge 0} i^n x^n/n!$, where $i^{2k} = (-1)^k$
and $i^{2k+1} = i(-1)^k$. 7.162. (b) Reduce to the case x = 1 and use the idea in 7.41. (c) If
$x^n = 0_R = y^m$, use the commutative binomial theorem to simplify $(x + y)^{n+m-1}$. 7.163.
Use 7.162. 7.164. (a) $x + x^3/6 + 3x^5/40 + 5x^7/112 + \cdots$; (b) $x - x^3/3 + x^5/5 - x^7/7 + \cdots$.
7.165. For $F = (1 - rx)^{-1}$, $F^{(k)} = r^k k!(1 - rx)^{-k-1}$, so Maclaurin's formula gives $F_k = r^k$
for all k. 7.168. (b) $n!(F^* G^*)_n = n! \sum_{k=0}^{n} F^*_k G^*_{n-k} = n! \sum_{k=0}^{n} (F_k/k!)(G_{n-k}/(n-k)!) =
\sum_{k=0}^{n} \binom{n}{k} F_k G_{n-k}$. 7.169. (a) $\sum_{n \ge 0} n^2 x^n = 1(1-x)^{-1} - 3(1-x)^{-2} + 2(1-x)^{-3}$. (b) Multiply
the series in (a) by $(1-x)^{-1}$. 7.170. (b) $(n(n+1)/2)^2$. 7.173. (a) $S(m) = aS(m-1) + c(b^k)^m$
for $m \ge 1$; S(0) = d. (b) Use partial fractions to solve $S(1 - ax) = (d - c) + c(1 - b^k x)^{-1}$.
Treat the case $a = b^k$ separately. The identity $n^{\log_b a} = a^{\log_b n}$ can be useful here. 7.174.
Argue that T(1) = d and T(n) = 2T(n/2) + cn for $n \ge 2$ (where c, d are constants). Now
apply 7.173 with a = b = 2 and k = 1. 7.175. Use 5.90(a) and 7.173, and compare the time
complexity to 5.90(b). 7.176. Write $G = \exp(\int P\,dx)$, which is well-defined with constant
term 1. The solution is $F = (c + \int QG\,dx)G^{-1}$. 7.179. (a) Transitivity of $\sim$: fix (a, b), (c, d),
(e, f) in X with $(a, b) \sim (c, d)$ and $(c, d) \sim (e, f)$, so ad = bc and cf = de. We must show
$(a, b) \sim (e, f)$, i.e., af = be. Now af(cd) = adcf = bcde = be(cd). If c = 0, then a = e = 0
since $d \ne 0 \ne f$, so af = be. Otherwise $cd \ne 0$, so cancellation of cd gives af = be. (d)
Injectivity of i: say $a, b \in D$ and i(a) = i(b). Then a/1 = b/1, so a1 = 1b and a = b.
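
The closed forms in 7.147, 7.151(b), and 7.152(c) can be compared against the series they are supposed to describe. The sketch below divides one truncated power series by a polynomial with nonzero constant term, using exact rational arithmetic; it checks coefficients only up to a chosen order and is not a proof. The function name `series_quotient` is illustrative.

```python
from fractions import Fraction

def series_quotient(num, den, terms):
    """First `terms` coefficients of num/den as formal power series.

    num and den are coefficient lists (constant term first); den[0] must be
    nonzero. Coefficients beyond the end of num are treated as zero.
    """
    out = []
    for n in range(terms):
        acc = Fraction(num[n] if n < len(num) else 0)
        # subtract the contributions of already-known coefficients
        for k in range(1, min(n, len(den) - 1) + 1):
            acc -= den[k] * out[n - k]
        out.append(acc / den[0])
    return out
```

For instance, 3/(1 + 2x) + 7/(1 − 4x) equals (10 + 2x)/(1 − 2x − 8x²), so `series_quotient([10, 2], [1, -2, -8], 8)` should agree with $3 \cdot (-2)^n + 7 \cdot 4^n$ for n < 8.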


8.36. (a) (1 — kr)-!. 8.38. 2*/(1 — x)*. 8.39. (a) (26,16,10,2,1). 8.40. (b) 
(21,13,7,11°). 8.41. (a) (9,5,3,19). 8.42. (a) (15,12,10,8,6,3,1); (d) (72). 8.45. (a) 
(18, 17, 16, 15, 13, 10, 8, 7). 8.46. (b) $\prod_{n \ge 1}(1 + x^n)$. 8.47. (a) coefficient of $x^n$ counts
integer partitions of n where all parts are divisible by 5. 8.48. (a) The identity
$1 + 3 + 5 + \cdots + (2k - 1) = k^2$ can be useful. 8.50. The result holds for the tree with one vertex. For
$t = (\bullet, t_1, t_2)$, we can assume by induction that the result holds for $t_1$ and $t_2$. If these trees
have $a_1$ and $a_2$ leaves, then t has $a_1 + a_2$ leaves and $(a_1 - 1) + (a_2 - 1) + 1 = a_1 + a_2 - 1$
non-leaves, so the result holds for t. 8.53. (a) Recursively define g~1(0) = (¢,0,0) 
and (for k > 0) g (ku... ux) = (0,t1,t2), where t) = g t((k — 1)u1... up_1) 
and tg = g '(ux). 8.56. With the notation used in 8.24, A has generating function 
[hoi + 2? t+ 2% +---4+ 26-04) = J]. (1 — 2%) /(1 — 2"). Argue carefully that this 
equals [],. docs not divide «(2 — v’)’, which is the generating function for B. 8.58. (a) Area 
is $\sum_{k=1}^{2n} k - \sum_{k=1}^{n} k = (2n)(2n+1)/2 - n(n+1)/2 = (3n^2 + n)/2$. 8.61. (a) 38. 8.62.
(a) Replace each 2) by (1 + tC), 8.63. (a) $(1 - \sqrt{1 - 4x})/2x = \sum_{n \ge 0} C_n x^n$. 8.65.
Terms of length 4 are 3000, 2100, 2010, 1200, 1110. So the coefficient of $x^4$ in the inverse is
$R_0^3 R_3 + 3R_0^2 R_1 R_2 + R_0 R_1^3$, which is −45 in (c). 8.66. (c) $x/(1 - ax)$. 8.67. (a) $G_S = 1 + 3xG_S$,
so $G_S = (1 - 3x)^{-1} = \sum_{n \ge 0} 3^n x^n$. (c) $G_S$ is not defined, because S with this weight is
not admissible. 8.69. (a) Gs = 1+ 2G. 8.70. (b) $G_S = (1 - x - \sqrt{1 - 2x - 3x^2})/2x$.
8.72. Argue that J], ag(1 — 2°)" TT even(1 + 2”) = Tgp (1 + o* + 2?* + o*). 8.73. Use 
a bijection. 8.74. (a) [];5 (1 + 2t"). (b) Given a self-conjugate \, consider the largest k 
such that $k \le \lambda_k$. 8.77. Extract the coefficient of $x^n$ in $n! \sum_{m \ge 0} (e^x - 1)^m/m!$. 8.79. (a)
The left side counts pairs $(\lambda, \mu) \in \mathrm{Par} \times \mathrm{DisPar}$ weighted by $x^{|\lambda| + |\mu|}$ and signed by $(-1)^{\ell(\mu)}$.
If $\lambda_1 > \mu_1$ or $\mu$ is empty, obtain $I(\lambda, \mu)$ by dropping the first part of $\lambda$ and adding it as the
new first part of $\mu$. If $\lambda_1 \le \mu_1$ or $\lambda$ is empty, obtain $I(\lambda, \mu)$ by dropping the first part of $\mu$
and adding it as the new first part of $\lambda$. The only fixed point is (0, 0), giving the 1 on the
right side. 8.81. (b) Coefficient of $x^n$ is $\sum_{\pi} R_{N_0(\pi)} R_{N_1(\pi)} \cdots R_{N_{n-1}(\pi)}$ where we sum over
Dyck paths $\pi$ ending at (n − 1, n − 1), and $N_i(\pi)$ is the number of north steps of $\pi$ on the
line x = i. 8.83. $e^{-x}/(1 - x)$. 8.84. $\exp(t(e^x - 1 - x))$. 8.87. This result is due to Bressoud
and Zeilberger [20]. 8.88. (a) A bijective proof can be given using 4.126; see [109].
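
Two of the generating functions above can be expanded exactly to see which sequences they encode. The sketch below extracts Catalan numbers from $(1 - \sqrt{1 - 4x})/2x$ (8.63) through the binomial series for the square root, and extracts $n!$ times the coefficient of $x^n$ in $e^{-x}/(1 - x)$ (8.83). That the second series is the exponential generating function of the derangement numbers is a standard fact; whether 8.83 asks for exactly that is an assumption. The function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def binomial_series_coeff(alpha, n):
    """Generalized binomial coefficient: alpha (alpha-1) ... (alpha-n+1) / n!."""
    c = Fraction(1)
    for j in range(n):
        c *= (alpha - j) / Fraction(j + 1)
    return c

def catalan_from_sqrt(n):
    """Coefficient of x^n in (1 - sqrt(1 - 4x)) / (2x)."""
    # coefficient of x^(n+1) in sqrt(1 - 4x)
    c = binomial_series_coeff(Fraction(1, 2), n + 1) * (-4) ** (n + 1)
    return -c / 2

def derangements_from_egf(n):
    """n! times the coefficient of x^n in exp(-x) / (1 - x)."""
    return factorial(n) * sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))
```
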


9.148. (b) Yes, every $e \in X$ works, but e is not unique. (d) No. 9.149. Closure: given
$x, y \in G$, write x = 2k + 1 and y = 2m + 1 for some $k, m \in \mathbb{Z}$. Then $x \star y = x + y + 5 =
2(k + m + 3) + 1 \in G$. Associativity: for all $x, y, z \in G$, $(x \star y) \star z = (x + y + 5) \star z =
x + y + z + 10$ and $x \star (y \star z) = x \star (y + z + 5) = x + y + z + 10$. Commutativity: for all
$x, y \in G$, $x \star y = x + y + 5 = y + x + 5 = y \star x$. Identity: for $e = -5 \in G$ and $x \in G$,
$x \star e = x + (-5) + 5 = x = e \star x$. Inverses: given $x \in G$, let $y = -x - 10 \in G$; then
$x \star y = x + (-x - 10) + 5 = -5 = e = y \star x$, so y is the inverse of x in G. 9.151. $x^2 = e$
implies $x = x^{-1}$ for all $x \in G$. Use this in 9.8(d). 9.154. The definition of $\oplus$ shows that
$(x \oplus y) \oplus z = x + y + z - cn$ for some $c \in \{0, 1, 2\}$, and also $(x \oplus y) \oplus z \in \mathbb{Z}_n$. The second condition
shows that c is given by the cases in the middle of (9.1). Similar analysis works for $x \oplus (y \oplus z)$.
9.156. (b) $[x, y]C_y([x, z]) = (xyx^{-1}y^{-1})y(xzx^{-1}z^{-1})y^{-1}$. Regroup and cancel $y^{-1}y$ and
then $x^{-1}x$ to get $x(yz)x^{-1}(yz)^{-1} = [x, yz]$. 9.157. To prove $x^{m+n} = x^m x^n$ for $m, n \ge 0$, fix
$m \in \mathbb{N}$. When n = 0, both sides are $x^m$. Assume $n \ge 0$ and $x^{m+n} = x^m x^n$ is known; then
$x^{m+(n+1)} = x^{(m+n)+1} = x^{m+n}x = (x^m x^n)x = x^m(x^n x) = x^m x^{n+1}$. Proceed similarly when
m < 0 or n < 0. 9.158. (d) Fix $g, h \in G$ with $\phi(g) = \phi(h)$. Then $g = eg = R_g(e) = R_h(e) =
eh = h$, so $\phi$ is one-to-one. 9.159. (a) For existence, choose $x = a^{-1}b \in G$. For uniqueness,
use left cancellation. 9.161. (a) f = (1,3,7,8,6,4,5)(2); (b) $f \circ g$ = [5,1,3,7,2,4,6,8].
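
Converting between one-line form and cycle notation, as in 9.161 and 9.163, is mechanical. The sketch below computes the cycle decomposition of a permutation of {1, ..., n} given in one-line form and derives the sign as $(-1)^{n - c}$, where c is the number of cycles including fixed points (the form of the computation shown in 9.163(b)). The one-line forms in the usage note were derived here from the cycle notations printed in 9.161(a) and 4.78; they are not quoted from the exercises. The function names are illustrative.

```python
def cycles(one_line):
    """Cycle decomposition of a permutation of 1..n given as [f(1), ..., f(n)]."""
    seen = set()
    result = []
    for start in range(1, len(one_line) + 1):
        if start in seen:
            continue
        cyc = []
        x = start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = one_line[x - 1]
        result.append(tuple(cyc))
    return result

def sign(one_line):
    """sgn(f) = (-1)^(n - number of cycles), counting fixed points as cycles."""
    return (-1) ** (len(one_line) - len(cycles(one_line)))
```

For example, the permutation f = (1,3,7,8,6,4,5)(2) has one-line form [3, 2, 7, 5, 1, 4, 8, 6], and `cycles` recovers the two cycles; the word 43728165 from 4.78 decomposes as (1,4,2,3,7,6)(5,8).
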
9.163. (a) gfg^{-1} = (5,2,6)(1,7)(3)(4,8); (b) sgn(g) = (−1)^{8−3} = −1; (c) h is not unique,
but we must have h(4) = 6. 9.164. lcm(μ_1,...,μ_k). 9.167. (a) Both sides are functions from
X to X that send i_j to i_{j+1} for 1 ≤ j < k, send i_k to i_1, and fix all other x ∈ X. 9.168.
Given f = f_1···f_i···f_j···f_n, f∘(i,j) = f_1···f_j···f_i···f_n is obtained by switching
the symbols in positions i and j; (i,j)∘f is obtained by switching the symbols i and j in
f_1···f_n. 9.170. Reduce to the case where f is a product of two basic transpositions. 9.171.
Argue that the only permutations w ∈ S_n giving nonzero terms in 9.37 map {1,2,...,k} to
itself and {k+1,...,n} to itself. 9.172. (a) n! − 1 additions and (n − 1)n! multiplications
(assuming multiplication by sgn(w) is free). (c) Use Gaussian elimination, keeping track
of the effect of each elementary operation on det(A). 9.174. Use the same formulas, with
no signs. 9.175. Define b_j = T(0_R,...,1_R,...,0_R) ∈ R where the 1_R is in position j.
Show that R-linearity forces T(v_1,...,v_n) = Σ_{k=1}^n b_k v_k for all v_k ∈ R. 9.177. Ax = b forces
x = A^{-1}b. The adjoint formula for A^{-1} gives x_i = det(A)^{-1} Σ_{j=1}^n (−1)^{i+j} det(A[j|i]) b_j.
The sum here is the Laplace expansion for det(A_i) along column i. 9.178. det(AB) =
−80 = (−2)·4 + 11·(−12) + (−12)·(−8) + (−9)·(4). 9.179. Adapt the argument
in 9.29. 9.182. First, e_G = x^0 ∈ ⟨x⟩. Second, given y,z ∈ ⟨x⟩, write y = x^m and z = x^n
for some m,n ∈ Z. Then yz = x^m x^n = x^{m+n} ∈ ⟨x⟩ since m+n ∈ Z. Third, given
y = x^m ∈ ⟨x⟩, y^{-1} = x^{−m} ∈ ⟨x⟩ since −m ∈ Z. 9.184. Imitate the proof of 9.59. 9.185.
(c) Take G = S_3, S = ⟨(1,2)⟩, T = ⟨(2,3)⟩; note ST = {e_{S_3}, (1,2), (2,3), (1,2,3)} is not a
subgroup since (1,2,3)^{-1} ∉ ST. 9.186. Consider the surjection f : S × T → ST defined
by f(s,t) = st. For each fixed element z = st ∈ ST, show that f(u,v) = z iff u = sw
and v = w^{-1}t for some w ∈ S ∩ T. 9.188. Assume H ≤ G is not normal in G; then there
exist g ∈ G, h ∈ H with ghg^{-1} ∉ H. Since the unique conjugacy class of G containing h
intersects both H and G ∖ H, H cannot be a union of conjugacy classes. The converse is
similar. 9.190. The sizes of the conjugacy classes of S_5 are 1, 10, 15, 20, 20, 24, and 30. A
normal subgroup of S_5 must be a union of conjugacy classes that includes {e_{S_5}} and whose
size divides 120. Check that the only possibilities are {e}, A_5, and S_5. 9.191. For each
x ∈ H, consider the sequence (x, x^2, x^3, ...). 9.193. (a) {[1,2,...,n], [n,n−1,...,2,1]};
(b) S_n; (c) S_n; (d) ⟨(1,2,...,n)⟩. 9.195. (a) Q_k has 2^k vertices and deg(Q_k) = k, so
|E(Q_k)| = k2^{k−1} by 3.34. (c) One can obtain 2^k k! automorphisms by permuting the k
positions and switching or not switching 0 and 1 in each position. Part (b) can help prove
these are all of the automorphisms. 9.197. To state the answer, consider the sizes of the 
equivalence classes of the graph isomorphism equivalence relation on the set {C1,...,C,}. 
9.198. (b) Let L ≤ H. Since f(e_G) = e_H ∈ L, e_G ∈ f^{-1}[L]. Fix x,y ∈ f^{-1}[L]. Then
f(x) ∈ L, f(y) ∈ L, so f(xy) = f(x)f(y) ∈ L and xy ∈ f^{-1}[L]. Also f(x^{-1}) = f(x)^{-1} ∈ L,
so x^{-1} ∈ f^{-1}[L]. If L is normal in H, then f^{-1}[L] is normal in G. For if x ∈ f^{-1}[L]
and g ∈ G, gxg^{-1} ∈ f^{-1}[L] because f(gxg^{-1}) = f(g)f(x)f(g)^{-1} ∈ L. Taking L = {e_H}
shows ker(f) ⊴ G. 9.199. Show that the map z ↦ (|z|, z/|z|) is a group isomorphism.
9.201. (b) If H ⊴ G and K ⊴ G and H ∩ K = {e_G}, first prove that hk = kh for all
h ∈ H and k ∈ K. 9.202. (b) Aut(Z) = {id_Z, N}, where N(k) = −k for all k ∈ Z.
9.204. (c) For k > 0, explain why the least m ∈ N^+ with (x^k)^m = e_G must satisfy
km = lcm(k,n). (d) If f is only a homomorphism, show the order of f(x) divides the
order of x (when x has finite order). 9.205. (a) Fix x_1,x_2,y_1,y_2 ∈ G with x_1H = x_2H and
y_1H = y_2H. We know x_2^{-1}x_1 ∈ H and y_2^{-1}y_1 ∈ H and must show x_1y_1H = x_2y_2H. For this,
note (x_2y_2)^{-1}(x_1y_1) = y_2^{-1}x_2^{-1}x_1y_1 = (y_2^{-1}y_1)(y_1^{-1}[x_2^{-1}x_1]y_1). The second parenthesized
expression is in H by normality, so (x_2y_2)^{-1}(x_1y_1) ∈ H by closure. When checking the
group axioms, note eH = H is the identity of G/H, and xH has inverse x^{-1}H for x ∈ G.
9.207. First show f is well-defined: i.e., for all x,y ∈ G, xK = yK implies f(xK) = f(yK).
9.209. To obtain f, apply the fundamental homomorphism theorem 9.207 to a suitable map.
9.210. Consider f : C/A → C/B given by f(xA) = xB for x ∈ C. Check f is a well-defined
group homomorphism with image C/B and kernel B/A, and apply 9.207. 9.211. 9.198 and
9.210 can be useful here. 9.213. Closure: for g,x ∈ G, g⋆x = gxg^{-1} ∈ G by the closure axiom
for the group G. Identity: for x ∈ G, e⋆x = exe^{-1} = xe = x. Associativity: for g,h,x ∈ G,
g⋆(h⋆x) = g(hxh^{-1})g^{-1} = (gh)x(gh)^{-1} = (gh)⋆x. 9.214. Associativity: for k,m ∈ K and
x ∈ X, k•(m•x) = k•(f(m)⋆x) = f(k)⋆(f(m)⋆x) = (f(k)f(m))⋆x = f(km)⋆x = (km)•x.
9.216. Use the fact that two linear maps from V to V are equal if they agree on a basis of
V. 9.218. H⋆x = {xh^{-1} : h ∈ H}. Since H is a group, h^{-1} ranges over H as h ranges over
H. So H⋆x = {xk : k ∈ H} = xH. 9.220. One approach is to note that ⋂_{x∈X} Stab(x)
is the kernel of the permutation representation associated to the given action. 9.221. (b)
Given g ∈ G, let C_g be the inner automorphism such that C_g(x) = gxg^{-1} for x ∈ G,
and let T : G → G be any automorphism. Verify that T∘C_g∘T^{-1} = C_{T(g)}. 9.222. For
(a,b) ≠ (0,0), the orbit of (a,b) is the circle with equation x^2 + y^2 = a^2 + b^2, and the
stabilizer of (a,b) is ⟨2π⟩. 9.224. Assume z ∈ Gx ∩ Gy, so z = g⋆x = h⋆y for some
g,h ∈ G. For any w ∈ Gx, w = k⋆x for some k ∈ G. Check that w = (kg^{-1}h)⋆y ∈ Gy,
so Gx ⊆ Gy. One proves Gy ⊆ Gx similarly. 9.227. Fix x,y ∈ G with xH = yH. Then
y^{-1}x ∈ H (by left coset equality theorem), so y^{-1}(x^{-1})^{-1} ∈ H, so Hx^{-1} = Hy^{-1} (by right
coset equality theorem), so T(xH) = T(yH). This shows T is well-defined. Reversing the
steps shows T is one-to-one. Since Hy = T(y^{-1}H) for y ∈ G, T is onto. 9.232. z_(6) = 6,
z_(5,1) = 5, z_(2,2,2) = 48 = z_(2,1,1,1,1), etc. 9.233. |C_{S_8}(g)| = 24; conjugacy class has size
1680. 9.235. Z(S_2) = S_2, Z(S_n) = {e_{S_n}} for all n ≠ 2. 9.237. (c) Conjugacy classes are {e}
(size 1); all 3-cycles (size 20); permutations of type (2,2,1) (size 15); class of (1,2,3,4,5)
(size 12); class of (2,1,3,4,5) (size 12). 9.238. Study the proof of 9.136. 9.239. (a) 5.
9.241. G is the disjoint union of its conjugacy classes. The term |Z(G)| counts conjugacy
classes of size 1, and the sum counts the remaining conjugacy classes (by 9.130). 9.242.
Apply 9.241. 9.243. Let H = ⟨(1,2,...,p)⟩ ≤ S_p, and study the H-set S_p/H. 9.244. Treat
odd n and even n separately. 9.246. For even n, the answer is (k^n + k^{n/2})/2. 9.247. (b)
(q^9 + 6q + 2q^3 + 9q^5)/18. 9.248. (7^4 + 11·7^2)/12 = 245. 9.250. (q^6 + 3q^4 + 12q^3 + 8q^2)/24.
9.252. (3) = 10. 9.254. (a) The symmetry group is Ay, so the answer is the coefficient 
of ee in (P(1,1,1,1) + 8p(3,1) + 3p(2,2))/12, which is 1. (b) 2. 9.256. (a) 3; (b) 8. 9.258. 
32, 885, 748, 000 (use inclusion-exclusion). 
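
Two of the numerical answers above lend themselves to a brute-force check. The sketch below enumerates S_5 to confirm the conjugacy class sizes quoted in 9.190, using the fact that conjugacy classes of S_n correspond to cycle types. It also counts orbits of colorings directly and via the orbit-counting (Burnside) formula. For the second check it is an assumption, not something stated in the answer, that the value 245 in 9.248 arises from A_4 permuting four positions colored with 7 colors; that reading is consistent with the expression (7^4 + 11·7^2)/12, since the 11 non-identity elements of A_4 each have two cycles. All function names are illustrative.

```python
from collections import Counter
from itertools import permutations, product


def cycle_type(perm):
    """Cycle lengths of a permutation of {0,...,n-1}, largest first."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def class_sizes(n):
    """Sizes of the conjugacy classes of S_n, keyed by cycle type."""
    return Counter(cycle_type(p) for p in permutations(range(n)))


def is_even(perm):
    """sgn(perm) = (-1)^(n - number of cycles); even means sgn = +1."""
    return (len(perm) - len(cycle_type(perm))) % 2 == 0


def count_orbits_brute(group, q):
    """Count orbits of q-colorings of n positions by canonical representatives."""
    n = len(group[0])
    representatives = set()
    for coloring in product(range(q), repeat=n):
        # The smallest element of {coloring composed with g} identifies the orbit.
        representatives.add(
            min(tuple(coloring[g[i]] for i in range(n)) for g in group)
        )
    return len(representatives)


def count_orbits_burnside(group, q):
    """Average number of fixed colorings: each g fixes q^(number of cycles of g)."""
    return sum(q ** len(cycle_type(g)) for g in group) // len(group)


S5_SIZES = sorted(class_sizes(5).values())
A4 = [p for p in permutations(range(4)) if is_even(p)]

if __name__ == "__main__":
    print(S5_SIZES)
    print(count_orbits_brute(A4, 7), count_orbits_burnside(A4, 7))
```

The brute-force count and the formula are computed independently, so agreement between them checks the formula as well as the arithmetic.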


10.135. Horizontal strips are (6)/(3), (5,1)/(3), (5,1)/(2,1), (4,2)/(3), (4,2)/(2,1), 
(4,1, 1)/(2,1), (4,1,1)/(1,1,1), (3,3)/(), (3,2,1)/(2,1), and (3,1,1,1)/(1,1,1). 10.137. 
μ/ν is a horizontal strip iff μ_i ≥ ν_i ≥ μ_{i+1} for all i ≥ 1. 10.140. (b) [two tableaux; figures
not recovered] (c) There are sixteen tableaux. 10.142. Informally, conjugate a tableau of shape
μ/ν to get a tableau of shape μ′/ν′. Formally, define F : SYT(μ/ν) → SYT(μ′/ν′) by
F(T) = ((i,j) ↦ T(j,i) : (i,j) ∈ μ′/ν′) for T ∈ SYT(μ/ν). Check F is a bijection. 10.144.
For N = 2: 2123 + a7ax9. For N = 3: 2123 + wer + 2103 + x3x3 + rox? + 7323 + 110273. 


10.145. (a) 16; (b) the tableaux are [two tableaux; figures not recovered], so the coefficient
is 2; (c) zero, since the first column cannot strictly increase. 10.147. N < max_{j≥1}(μ′_j − ν′_j).
10.148. (b) x_1x_2x_3 + x_1x_2x_4 + x_1x_2x_5 + x_1x_3x_4 + x_1x_3x_5 + x_1x_4x_5 + x_2x_3x_4 + x_2x_3x_5 +
x_2x_4x_5 + x_3x_4x_5. 10.149. (a) a} + 23 + 23 + D0,,, ia7. 10.150. (a) ({). 10.153. One answer is
(9,9,9,8,4,2,2,2,2,2,1)/(8,8,8,4,2,1,1,1,1,1). 10.156. {m_μ(x_1,x_2,x_3) : μ ∈ Par_3(7)} =
{m_(7), m_(6,1), m_(5,2), m_(4,3), m_(5,1,1), m_(4,2,1), m_(3,3,1), m_(3,2,2)}. 10.157. e.g., for k = 6 and
N = 1,2,...,6, the dimensions are 1, 4, 7, 9, 10, 11 = p(6). 10.159. (a) 11. 10.160. fs
maps fhe taiblenil to the tableau with rows 1,1,1,1,2,3; 2,3,3,3,4, 4,4; 1,3, 3,4, 4, 5, 6, 6; 
4,4,5,5,5,7,9; 5,6,7,7,8. 10.162. e.g., s_(3,1) = m_(3,1) + m_(2,2) + 2m_(2,1,1) + 3m_(1,1,1,1) and
s_(2,1,1,1) = m_(2,1,1,1) + 4m_(1^5). 10.164. e.g., h_(2,1) = s_(3,1)/(1) = m_(3) + 2m_(2,1) + 3m_(1,1,1).
10.165. For the program, 10.137(a) may be useful. 10.166. First check that ≤_lex is reflexive,
antisymmetric, and transitive on Par(k). Given μ,ν ∈ Par(k) with μ ≠ ν, the leftmost nonzero
entry in μ − ν is either positive or negative, so either ν ≤_lex μ or μ ≤_lex ν. 10.168. (b)
μ = (2,2,2,1), ν = (3,1,1,1,1); μ = (3,2,2), ν = (4,1,1,1); μ = (3,3,1), ν = (4,1,1,1);
μ = (4,3), ν = (5,1,1). 10.169. (a) (5,4, 2,1,1)R(6, 4, 2, RU? 3,2, 1). 10.170. The statement
fails; e.g., μ = (2,2,2,1) ≤_lex (3,1,1,1,1) = ν, but ν′ = (5,1,1) ≰_lex (4,3) = μ′.
10.172. To prove well-ordering, recall that N is well-ordered. So the first components
of the α^j must stabilize for j ≥ j_1. Then the second components must eventually stabilize,
and so on. 10.173. For the statement about deg(gh), it may help to show first that
β ≤_lex α implies β + γ ≤_lex α + γ for all α, β, γ ∈ N^N. 10.174. (a)

        1  0  0  0  0
        3  1  0  0  0
    K = 2  1  1  0  0
        3  2  1  1  0
        1  1  1  1  1

(b)

             1  0  0  0  0
            −3  1  0  0  0
    K^{-1} = 1 −1  1  0  0
             2 −1 −1  1  0
            −1  1  0 −1  1

so, e.g., m_(2,2) = s_(1^4) − s_(2,1,1) + s_(2,2). 10.177.

Use 9.51. 10.178. First show (by induction) that for all i ∈ I, there are scalars b_{ij} ∈ K
with b_{ii} ≠ 0 and v_i = Σ_{j≤i} b_{ij} w_j. 10.179. For i = 4, T ← 4 has rows 1,1,2,3,4,4,4;
2,4,5,6,6,6; 3,5,7,8; 4,6. 10.180. For i = 2, T ← 2 has rows 2,2,2,5,5,7,7; 3,3,3,6,7,8;
4,4,5,8,8; 5,6,6,9; 6,8,8; 7; 8. 10.182. (b) Shift the entire first column down one 
row, and put x in the 1,1-position. 10.183. Starting at the corner box in row 3, re- 
verse insertion produces JT; with rows 1,1,2,3,4,6,6; 2,4,5,6,8; 3,5, 7; 4,6; and x; = 4. 
10.184. Starting at the corner box in row 5, reverse insertion produces the tableau 
with rows 2,2,4,5,5,7,7; 3,3,5,6,7,8; 4,5,6,8,8; 6,6,8,9; 7,8; 8; and the output value 
rows 1,1,1,2,2,3,5,5; 2,2,3,4,4,6; 3,4,5,6,6; 4,5,7,8; 6. New boxes are (5,1), (4,3),
(4,4), (3,5), (2,6), (1,8) in this order. 10.191. Final tableau has rows 1,2,2,3,5,5,7,7;

, 7; 3,4, 5,6, 8,8; 4,6,6,7; 5,8,8,8; 6,9; 7; 8. New boxes are (1,8), (3,6), (5,4), 
(6,2), 7,1), (8,1) in this order. 10.193. e.g., v = (5,5,5,5,1) cannot be reached; in- 
sertion of 3,2,1 gives the shape v = (5,5,4,4,2 1), 10.195. S has rows 2,3, 4,5, 7, 7,8; 
3,5,5,6,8; 4,6,6,8; 6,8,8,9; 7; 8; and x_1x_2x_3x_4 = 2357. 10.197. (a) s_(5,4,1) + s_(5,3,2) +
s_(5,3,1,1) + s_(4,4,2) + s_(4,4,1,1) + s_(4,3,2,1) + s_(4,3,1,1,1). 10.199. (b) s_(8) + s_(7,1) + s_(6,2) + s_(5,3).
(d) First explain why s_(6,3,2,2)/(3,2) = s_(2,2) h_1 h_3. 10.200. (a) 1; (b) 7. 10.201. e.g., h_(2,2) =
6m_(1,1,1,1) + 4m_(2,1,1) + 3m_(2,2) + 2m_(3,1) + m_(4). 10.203. (c) s_(5,3) + s_(4,4) + 3s_(4,3,1) + s_(5,2,1) +
s_(4,2,2) + 2s_(4,2,1,1) + s_(3,3,2) + 2s_(3,3,1,1) + s_(3,2,2,1) + s_(3,2,1,1,1). 10.204. (a) 1; (b) 0. 10.205.
e_(2,2,1) = m_(3,2) + 2m_(3,1,1) + 5m_(2,2,1) + 12m_(2,1,1,1) + 30m_(1^5). 10.207. Count matrices A with
entries in {0,1} satisfying Σ_j A(i,j) = α_i for all i and Σ_i A(i,j) = λ_j for all j. 10.208. (a)
2h_1(x_1,x_2)h_2(x_1,x_2) − h_1(x_1,x_2)^3 − h_3(x_1,x_2) = 0. 10.211. Use 10.172 to prove termination.
10.212. For m_(2,1), α = (2,1,0,0) initially, so consider m_(2,1) − e_1e_2. This is −3e_3, so
m_(2,1) = e_1e_2 − 3e_3. 10.213. J is −(x_1 − x_2)(x_1 − x_3)(x_2 − x_3) ≠ 0. 10.215. The first object
maps to (3, 345, 11223). 10.217. Interchange the symbols h_k and e_k (for all k) everywhere in
the proof of 10.88. The recursion (10.7) is unchanged by this, so the proof works. 10.218. (b)
h_k(x_1,...,x_N) = h_k(x_1,...,x_{N−1}) + x_N h_{k−1}(x_1,...,x_N). 10.219. (a) Use 2.75. 10.221. (a)
2; (b) 4; (e) compare to 10.212. 10.222. h_4(2,3,5) = 2261. 10.224. Adapt the proof of 10.88.
10.226. f sends the first object to [2]4[4]4*[4]5[5| 10.227. I sends the first object to 
(2, 14, 3333). 10.228. >.) Pngi(ai,--., en)”. 10.229. For F = [].,(1 — ait), what is 


(dF/dt)/F? 10.232. afer) = (22,10, 002,506,890), ( § ; ; ; i : y al) 


10.234. iA) has w = 32754681 and T = 24444666. 10.235. (b) Check A^k has eigenvalues
r_1^k,...,r_n^k, hence tr(A^k) = p_k(r_1,...,r_n). (c) Use 10.98 and 10.223. 10.236. (a) es;
(b) −p_(3,2,1,1). 10.237. Use 7.102 and 10.86; h_n maps to (−1)^n e_n. 10.239. e.g., applying


w to 10.212 gives fgti24) = Aihe — 3hg = —2m(g) — Mea1). 10.241. Let T = Fy and 


is 3. 10.187. ae, 4,3,1,1) + $(4,4,4,1,1) + $(4,4,3,2,1) + $(4,4,3,1,1,1): 10.188. Final tableau has 


Then RSK^{-1}(T,T) = 3412, RSK^{-1}(T,U) = 3142, RSK^{-1}(U,T) = 2413, and
RSK^{-1}(U,U) = 2143. 10.242. P(w) and Q(w^{-1}) have rows 1,2,3,8; 4,5,6; 7. Q(w) and
P(w^{-1}) have rows 1,3,4,6; 2,5,7; 8. 10.243. w = 57218463 and v = 43861725 = w^{-1}.
10.247. (a) RSK is a bijection from S_n to ⋃_{λ∈Par(n)} SYT(λ) × SYT(λ). Now take cardinalities
of each set and use the sum and product rules. 10.250. P(w) has rows 1,1,1,1,3; 2,2;
3,3; and Q(w) has rows 1,3,6,7,8; 2,4; 5,9. Des(w) = {1,3,4,8} = Des(Q). 10.252. (a)
[4], = t+ +23 + t*. 10.253. p_(1^4) = s_(1^4) + 3s_(2,1,1) + 2s_(2,2) + 3s_(3,1) + s_(4). 10.255.
0 1 0 1 
(a) Matrix is | * 9 5 9 |; P has rows 1,1,2,3,3; 2,3; 4; Q has rows 1,1,2,3,3; 2,2; 
0 1 0 0 
4. (b) Transpose the matrix in (a), and switch P and Q. 10.257. For a solution, see [46, 
§4.2]. 10.260. Show A is invertible, and then show B = A~! (where A~! denotes the unique 
two-sided inverse of A). 10.261. We know g = Σ_μ a_μ f_μ for unique scalars a_μ ∈ K. Take the
scalar product of both sides with f_ν. 10.264. Define a bijection mapping a semistandard
tableau to a pair consisting of a standard tableau U and an object whose weight is one of
the monomials in Q_{n,Des(U)}.
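
The matrices in 10.174 and the value in 10.222 can be recomputed mechanically. The sketch below assumes that K is the Kostka matrix for partitions of 4, with rows and columns ordered (1,1,1,1), (2,1,1), (2,2), (3,1), (4), where the row for a shape λ lists the coefficients of s_λ in the monomial basis; this ordering is inferred from the sample expansion m_(2,2) = s_(1^4) − s_(2,1,1) + s_(2,2) and from 10.162. Kostka numbers are counted by peeling horizontal strips off the shape, using the criterion of 10.137. All function names are illustrative.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations_with_replacement
from math import prod


def horizontal_strip_removals(shape, size):
    """Yield partitions nu such that shape/nu is a horizontal strip with `size` cells.

    Uses the criterion shape_i >= nu_i >= shape_{i+1} for all i.
    """
    padded = list(shape) + [0]

    def rec(i, remaining, acc):
        if i == len(padded) - 1:
            if remaining == 0:
                yield tuple(p for p in acc if p > 0)
            return
        for nu_i in range(padded[i], padded[i + 1] - 1, -1):
            removed = padded[i] - nu_i
            if removed <= remaining:
                yield from rec(i + 1, remaining - removed, acc + [nu_i])

    yield from rec(0, size, [])


@lru_cache(maxsize=None)
def kostka(shape, content):
    """Number of semistandard tableaux of the given shape and content."""
    if not content:
        return 1 if not shape else 0
    return sum(
        kostka(nu, content[:-1])
        for nu in horizontal_strip_removals(shape, content[-1])
    )


def inverse(matrix):
    """Exact inverse by Gauss-Jordan elimination (integer-valued result assumed)."""
    n = len(matrix)
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [x / scale for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [[int(x) for x in row[n:]] for row in aug]


PARTITIONS_OF_4 = [(1, 1, 1, 1), (2, 1, 1), (2, 2), (3, 1), (4,)]
K = [[kostka(lam, mu) for mu in PARTITIONS_OF_4] for lam in PARTITIONS_OF_4]
K_INV = inverse(K)

# h_4(2,3,5): sum of all degree-4 monomials in the values 2, 3, 5.
H4 = sum(prod(c) for c in combinations_with_replacement((2, 3, 5), 4))

if __name__ == "__main__":
    for row in K:
        print(row)
    for row in K_INV:
        print(row)
    print(H4)
```

Row 3 of the computed inverse reads (1, −1, 1, 0, 0), which is the expansion of m_(2,2) quoted in 10.174(b).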


11.73. wt(w) = 27 and J(w) = (2,(6,6,4,3,2,2,2,1,1)). 11.74. e.g., U(−1,(3,2)) =
···110010100···. 11.76. First ask how the frontier of μ′ is related to the frontier of μ.
11.77. Both w a J(w) have weight u°q'’. 11.80. [],5,(1- 2") Smez(-1)™ gm? +m) /2 


(take q = a° and u = —2~? in 11.5). 11.83. (b) (¢‘7'g):16-21-14. 11.84. For k = 3, the 3-core 
is 0, ν^0 = (2,2), ν^1 = (3,1), and ν^2 = (2,1,1,1). 11.85. The 2-cores are the “staircases”
(n,n−1,...,3,2,1,0) for all n ≥ 0. 11.88. For k = 5, the 5-core is (2,2), ν^0 = ν^1 = (1,1),
=(2, 2). oF Sp = (2) 21,89: For ki S(t (9) 2 end O =(0), 
11.93. a_(6,3,1)(x_1,x_2,x_3) = +x_1^6x_2^3x_3 + x_1^3x_2x_3^6 + x_1x_2^6x_3^3 − x_1^3x_2^6x_3 − x_1^6x_2x_3^3 − x_1x_2^3x_3^6.
11.95. K-linearity of the map T(f) = f·a_{δ(N)}: for f,g ∈ Λ_N and c ∈ K, T(f+g) = (f+g)a_{δ(N)} =
f·a_{δ(N)} + g·a_{δ(N)} = T(f) + T(g) and T(cf) = (cf)a_{δ(N)} = c(f·a_{δ(N)}) = cT(f). 11.96. (b)
For f ∈ Λ_N, g ∈ A_N, and w ∈ S_N, w•(fg) = (w•f)(w•g) = f(sgn(w)g) = sgn(w)(fg),
so fg € An. 11.98. wt(v) = a323?xfat7}2}3, w(v) = 625314, pos(v) = (13, 12, 10, 7, 3, 2), 
and sgn(v) = +1. 11.100. (c ya @(3,1,1,1,1,0)-+6(6) — 9(2,2,11,1,0)+5(6 ): For N > 7, add the 


term —4(17)45(N ): 11.101. I(v,3) = (0410030206500 - - ,4). 11.102. (b) a eee, ee + 


@(4,3,3,1,0,0)+65(6 ) + @(3,3,3,1,1,0)+6(6)- 11.103. T(v, {3, 4, 6}) = (0310040205600 - - , {3,5 , 6}). 
11.104. (a) Ycay,s5) for X = 721,631,622, 6211, 541,532, 5311, 5221, 442, 4411, 4321. 
11.105. For M = [1,1,4,5], [(v,M) = (v,M) with v* = 0300104206050---; horizontal 
strip is (6, 5,4, 4, 3, 1)/(5, 5,4, 3, 1, 1). 11.109. (a) §(6,3,2) — $(5,4,2) — §(3,3,2,2,1) + 8(3,3,2,1,1,1)- 
11.110. (a) 1 3 ; (b) 0. 11.112. Ps) /45 = P(3,1,1,1)/9 + P(5,1)/5 = P(3,3)/9- 11.113. e. g., 
P11) = ae + §(3,1) — §(2,1, y — sa4) and s22) = pisy/12 — pg3j)/3 + Pa,2)/4. 
11.114. (a) I(v,T) = (5431200--- , 7") where T’ has rows 2,2, 1,1, 1; 1,3,4; 3,5. 11.117. 

pay /4— pana + P(2,2,1)/8 harass + 5p(11,1,1,1)/24. 11.120. (a) hys,3) — he@,2)- 
11.121. (b) €(3,2,1) + €(5,1) — €(3,3) — €4,1,1)- 11.122. (a) h3,2,2) +hys,1,1) — 2h(4,2,1)- 11.124. 
Think of the x-coordinate as the position on an abacus and the y-coordinate as time. 
11.125. For partitions of 4, see the solution to 10.174(b). 11.126. Generalize the proof 
of 11.64. 11.127. See [33]. 11.128. See [87]. 11.129. (a) I(v°,T) = (5423100--- ,T’),
where T” is the skew tableau with word w(T’) = 22513244322111. 11.130. (a) 2. 11.132. 
8(5,4,1) + 8(5,3,2) + 9(5,3,1,1) + 2(4,4,2) + 8(4,4,1,1) + 9(4,3,3) + 9(4,3,2,1)- 11.133. (b) 8(5,2) + 8(4,3). 
11.135. (a) 4. 
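
The claim in 11.85 that every 2-core is a staircase can be tested on small cases. The sketch below computes k-cores with an abacus: a partition is encoded by its set of bead positions (first-column hook lengths), and removing a rim hook of length k corresponds to sliding a bead up k positions on its runner, so the core is reached by pushing all beads on each runner as far up as possible. This is the standard abacus construction rather than necessarily the exact conventions of Chapter 11, and the function names are illustrative.

```python
def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def k_core(partition, k):
    """Return the k-core of a partition using a k-runner abacus."""
    parts = [p for p in partition if p > 0]
    m = len(parts)
    # Bead positions: part i (0-indexed) sits at parts[i] + (m - 1 - i).
    beads = {parts[i] + (m - 1 - i) for i in range(m)}
    pushed = set()
    for runner in range(k):
        count = sum(1 for b in beads if b % k == runner)
        # Slide the beads on this runner to the top positions runner, runner+k, ...
        pushed.update(runner + k * j for j in range(count))
    descending = sorted(pushed, reverse=True)
    core = [descending[i] - (m - 1 - i) for i in range(m)]
    return tuple(p for p in core if p > 0)


def is_staircase(partition):
    """True for (n, n-1, ..., 2, 1), including the empty partition."""
    return partition == tuple(range(len(partition), 0, -1))


if __name__ == "__main__":
    for lam in partitions(6):
        print(lam, k_core(lam, 2))
```

For example, `k_core((3, 1), 2)` returns the empty partition, since two dominoes can be stripped from (3,1), while `k_core((2, 1), 2)` returns (2,1) unchanged.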


12.90. (b) One equivalence class is {NENENE, ENENEN}. There are three other equivalence
classes, each of size 6. 12.92. Label each point (x,y) by the integer x − k − my and
thereby divide the “bad” paths into m classes. Reflections do not work for m > 1, so look
for other symmetries. For a solution, see [86]. 12.94. (b) ENNNEEEENNNENEENEEENENNN.
12.95. (a) NENENNNENENNENNEEENEEENE. 12.96. Check that reflection
across the line y = x gives a bijection between the event {π : X_j(π) = 0} and the event
{π : X_j(π) = 1}. 12.97. P(X_1 + ··· + X_n = k) = (n choose k)/2^n, which is not a uniform
distribution. 12.98. 2.30 may be helpful. 12.99. (a) 1 + 6x + 7x^2 + x^3; (c) 1 + nx. 12.102. (a)
The multiset [μ_i + i : 1 ≤ i ≤ n] is [n+1, n+2, ..., 2n]. The multiset [ν_i + i : 1 ≤ i ≤ n]
is [2n, 2n−1, ..., n+1]. These multisets are equal, so 12.10 applies. (b) One approach
is to find a recursion satisfied by both r_k(μ) and r_k(ν). 12.105. For a solution, see [88].
12.106. The first labeled path in row 2 maps to the parking function given by f(1) = 3,
f(2) = f(3) = 1. The associated tree has edges {0,2}, {0,3}, and {3,1}. 12.107. (c) Labeled
path is NENNNNEEEENENNENENEE with labels 9, 2, 3, 4, 6, 7, 5, 10, 8, 1 (from coe) 
Parking function is f(1) = 9, f(2) = f(3) = f(4) = f(6) = 2, f(5) = f(10) = 7, f(7) = 

f(8) = 8, f(9) = 1. 12.109. (a) (,.°..,). 12-110. Let m be the spot where car n ae 
12.114. Translate the proof of 12.15 into group theory. 12.115. 2. 12.116. The statement 
is false, as shown by the groups Z_2 × Z_2 or S_3. 12.119. (b) Check that {±1, ±i, ±j, ±k}
is a finite subgroup of H* that is neither cyclic nor commutative. (c) For all (b,c,d) ∈ R^3
with b^2 + c^2 + d^2 = 1, use the distributive law to check that (bi + cj + dk)^2 = −1 in
H. 12.121. (q^2 + q^12 − q^4 − q^6)/12. 12.122. (a) Those of degree at most 4 are x, x + 1,
x^2 + x + 1, x^3 + x + 1, x^3 + x^2 + 1, x^4 + x + 1, x^4 + x^3 + 1, x^4 + x^3 + x^2 + x + 1. (b)
For 1 ≤ n ≤ 7, the answers are 2, 1, 2, 3, 6, 9, 18. 12.123. To prove that multiplicative
inverses exist in K, suppose f ≠ 0 is in K. By irreducibility of h, gcd(f,h) = 1, so that
af + bh = 1 for some a,b ∈ F[x]. Check that a mod h, the remainder when a is divided by
h, is a two-sided inverse for f in K. 12.125. (a) For y ≠ 0, use Lagrange’s theorem on K^*.
(c) x^16 − x = x(x + 1)(x^2 + x + 1)(x^4 + x + 1)(x^4 + x^3 + 1)(x^4 + x^3 + x^2 + x + 1). 12.126.
(a) Note I(n,q) ≥ (q^n − (1 + q + q^2 + ··· + q^{n−1}))/n > 0. (b) Use (a) and 12.123. 12.127.
(a) φ(q^n − 1)/n; (b) x and x^4 + x^3 + x^2 + x + 1 are the two examples of smallest degree.
12.128. (a) q’"+/?, 12.130. Build such a matrix one row at a time. Each row must be
chosen outside the span of the preceding rows. 12.131. [3 ie 140, 050. 12.133. RREF basis
is (1,4,0,4,0), (0,0,1,2,0), (0,0,0,0,1). 12.136. (a) Set up a bijection between certain
subspaces of V and certain subspaces of the dual space V*. (b) Use 2.79. 12.139. The coefficients
give the probability that a randomly selected permutation is an up-down permutation.
12.141. (b) n = 0: the empty permutation; n = 2: 12; n = 4: 1324, 1423, 2413, 2314, 3412. 
12.143. (b) s9 = 52 =1, sg =q+2q? +93 +44. 12.144. wt(t) = cizse2riz5z6, inv(t) = 8, 
sgn(t) = +1, t is not transitive since (2,3), (3,4), (4,2) € t. 12.147. Write A = ay). 
Note det(A) = ∏_{i<j}(x_i − x_j) ≠ 0 since the x_i’s are distinct, so A is invertible. Now write
p = Σ_{i=0}^n c_i x^i where c_i ∈ F, and let v = (c_n,...,c_1,c_0). The hypothesis p(x_k) = 0 for all
k means that vA = 0, hence v = 0 since A is invertible. So p = 0 in F[x]. 12.149. (b) 462;
(ec) (*F)71). 12.150. Classify T € SYT(A) based on the location of |A| in T, and use the 
sum rule. 12.153. (a) f^λ = p!/∏_{c∈dg(λ)} h(c). If λ is not a hook, each number h(c) in the
denominator is less than p. Since p is prime, the factor p in the numerator cannot cancel
with anything in the denominator. (b) (—1)0)-1. 12.157. P(w) has shape (8,5, 5,3, 1). 
So the longest increasing subsequence has length 8, and the longest decreasing subsequence 
has length 5. 12.160. Sketch: Let w_1w_2···w_{mn+1} have no increasing subsequence of length
m+1. For 1 ≤ i ≤ mn+1, let s_i be the length of a longest increasing subsequence of w that
begins with w_i. Argue that there must exist n+1 indices i whose corresponding lengths s_i
are all the same, and use this to obtain a decreasing subsequence of length n+1. 12.161.
Note det(A) = det(A^t) = det(−A) = (−1)^n det(A) = −det(A). So det(A) = 0 (assuming
the field does not have characteristic two). 12.163. Writing a_{ij} for A(i,j), the Pfaffian is
a_12a_34a_56 − a_12a_35a_46 + a_12a_36a_45 − a_13a_24a_56 + a_13a_25a_46 − a_13a_26a_45 + a_14a_23a_56 − a_14a_25a_36 +
a_14a_26a_35 − a_15a_23a_46 + a_15a_24a_36 − a_15a_26a_34 + a_16a_23a_45 − a_16a_24a_35 + a_16a_25a_34. 12.166.
(a) (15243768, 13264857); (b) (14235768, 14235768). 12.167. (a) w = (1,3, 6,8, 2,5)(4, 7) = 
35671842; term is —A(1,3)A(2,5)A(3, 6) A(4, 7)A(1, 5) A(6, 8) A(4, 7) A(2, 8). 12.168. There 
are 817,991 tilings of the 6 × 9 board. 12.169. (a) 144,092. 12.171. To prove property 3, fix
i,j ∈ {1,2,...,mn}, and write i = m(i_1 − 1) + i_2, j = m(j_1 − 1) + j_2 where 1 ≤ i_1,j_1 ≤ n
and 1 ≤ i_2,j_2 ≤ m. Then the i,j-entry of (A_1 ⊗ B_1)(A_2 ⊗ B_2) is
Σ_{k=1}^{mn} (A_1 ⊗ B_1)(i,k)(A_2 ⊗ B_2)(k,j)
= Σ_{k_1=1}^n Σ_{k_2=1}^m (A_1 ⊗ B_1)(i, m(k_1 − 1) + k_2)(A_2 ⊗ B_2)(m(k_1 − 1) + k_2, j)
= Σ_{k_1=1}^n Σ_{k_2=1}^m A_1(i_1,k_1)B_1(i_2,k_2)A_2(k_1,j_1)B_2(k_2,j_2)
= (Σ_{k_1=1}^n A_1(i_1,k_1)A_2(k_1,j_1)) · (Σ_{k_2=1}^m B_1(i_2,k_2)B_2(k_2,j_2))
= (A_1A_2)(i_1,j_1) · (B_1B_2)(i_2,j_2) = ((A_1A_2) ⊗ (B_1B_2))(i,j). As
a special case, if A and B are invertible, (A ⊗ B)(A^{-1} ⊗ B^{-1}) = (AA^{-1}) ⊗ (BB^{-1}) =
I_n ⊗ I_m = I_{nm} (and similarly in the other order), so A ⊗ B is invertible with inverse
A^{-1} ⊗ B^{-1}. 12.174. (a) Show both sides are polynomials in u with the same zeroes and
the same leading term. (b) Set u = 0 in (a). 12.175. Compare the factors indexed by k and
n+1−k. For n odd, separately evaluate the product over j when k = (n+1)/2.
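
The count quoted in 12.168 can be reproduced by direct enumeration. The sketch below assumes the tilings in question are domino tilings of a rectangular board, which is the usual setting for this number. It is not the determinant or product formula from the text but a simple column-by-column dynamic program: each column is filled cell by cell, and a bitmask records which cells of the next column are already covered by horizontal dominoes. The function name is illustrative.

```python
from functools import lru_cache


def count_domino_tilings(rows, cols):
    """Count domino tilings of a rows x cols board by a profile dynamic program."""

    @lru_cache(maxsize=None)
    def fill(col, row, current, nxt):
        # `current`: covered cells in this column; `nxt`: cells of the next
        # column already covered by horizontal dominoes placed from this column.
        if row == rows:
            return columns(col + 1, nxt)
        if current & (1 << row):
            return fill(col, row + 1, current, nxt)
        total = 0
        # Place a horizontal domino reaching into the next column.
        if col + 1 < cols:
            total += fill(col, row + 1, current | (1 << row), nxt | (1 << row))
        # Place a vertical domino covering this cell and the one below it.
        if row + 1 < rows and not current & (1 << (row + 1)):
            total += fill(
                col, row + 2, current | (1 << row) | (1 << (row + 1)), nxt
            )
        return total

    @lru_cache(maxsize=None)
    def columns(col, mask):
        if col == cols:
            return 1 if mask == 0 else 0
        return fill(col, 0, mask, 0)

    return columns(0, 0)


if __name__ == "__main__":
    print(count_domino_tilings(6, 9))
```

The state space is small (at most 2^rows masks per cell), so boards of this size are handled instantly; for large boards the closed-form product is the better tool.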